1 The meaning and expression of definiteness


Definiteness has been a central topic in theoretical semantics since its modern 
foundation. Two main lines of thought have classically debated the proper
analysis of definite noun phrases. One of them, initiated by Frege (1892), Russell 
(1905), and Strawson (1950), argues that definite descriptions crucially involve 
the condition, be it asserted or presupposed, that their descriptive content
is satisfied by a unique entity (in the relevant context of use). The other line of 
thought, originally proposed by Christophersen (1939) and later elaborated by Heim
(1982) and Kamp (1981), claims that the core of definiteness depends on the existence
of a referent in the common ground shared by the speaker and the hearer.
Most of the contemporary approaches to definiteness opt for either uniqueness 
(e.g. Hawkins 1978; Kadmon 1990; Hawkins 1991; Abbott 1999) or familiarity (e.g. 
Green 1996; Chafe 1996), although there are other studies that point out that nei- 
ther approach by itself provides a satisfactory explanation for all the empirical 
data concerning the use of definite descriptions in English (e.g. Birner & Ward
1994). These findings point to a third standpoint, which holds that the semantic
basis of definiteness lies in a different characteristic, such as salience (Lewis
1979) or identifiability (Birner & Ward 1994). Another stance combines the first
two "classical" approaches and claims that both uniqueness and familiarity are
needed to explain the empirical behavior of the English definite article (Farkas 
2002; Roberts 2003). 

The theoretical discussion on definiteness has been revisited more recently by 
Schwarz (2009; 2013) and Coppock & Beaver (2015). In investigating the expres- 
sion of definiteness in different languages, Schwarz proposes that, in order to 
account for the semantic value of definite descriptions crosslinguistically, both 
familiarity and uniqueness are needed. In some languages, moreover, they even 
correspond to different forms of definite markers that can be dubbed, respectively,
"strong" and "weak" definite articles. When such a semantic division of labor
is explicit, the uniqueness component is often encoded by a bare noun phrase
or by a silent determiner (Arkoh & Matthewson 2013). Coppock & Beaver (2015) 
also analyze definiteness into two main components: uniqueness and determi- 
nacy. Definiteness marking is seen as a morphological category that triggers a 
uniqueness presupposition, while determinacy consists in referring to an indi- 
vidual (i.e. having a type e denotation). Definite descriptions are argued to be 
fundamentally predicative, presupposing uniqueness but not existence, and to 
acquire existential import through general type-shifting operations (Partee 1986). 
Type-shifters enable argumental definite descriptions to become either determi- 
nate (and thus denote an individual) or indeterminate (and thus function as an 
existential quantifier). 

The study of the meaning and expression of definiteness has advanced our
understanding not only of regular definite noun phrases, that is to say, constituents
that refer to ordinary individuals, like the one exemplified in (1a), but also of
other interpretations. Generic definites (1b), weak definites (1c) and superlatives (1d)
allegedly involve reference to non-ordinary objects or individuals, and yet in
languages like English they are associated with the presence of a definiteness
marker.


(1) a. Hopefully, people will go out and start looking at the moon today.
    b. The potato genome contains twelve chromosomes.
    c. When do babies go to the dentist for their first visit?
    d. Donald owns the highest building in New York.

These “non-ordinary” definite descriptions have been discussed in the literature,
for example: generic definites are analyzed in Chierchia (1998), Dayal (2004),
Krifka (2003), Farkas & de Swart (2007) and Borik & Espinal (2012); weak defi- 
nites have been the main topic in Carlson & Sussman (2005), Aguilar-Guevara & 
Zwarts (2011; 2013), Schwarz (2014) and Zwarts (2014); while superlatives have 
been treated by Szabolcsi (1986), Hackl (2009), Sharvit & Stateva (2002), Krasi- 
kova (2012) and Coppock & Beaver (2014). 

Definiteness has also awakened the interest of generative syntacticians. The 
common assumption for languages with articles is that these correspond to the 
heads of determiner projections (DP). In contrast, the opinions about article-less 
languages are divided. Some authors, following the Universal DP approach, as- 
sume that a DP is present in all languages, regardless of whether or not they 
have an overt definite article (e.g. Cinque 1994; Longobardi 1994). This means 
that bare nouns with a definite interpretation in article-less languages have a 
definite article, the D-head, which is unpronounced. Other authors, following 
the DP/NP approach, propose that not all nominal arguments correspond to DPs 
and that some languages might lack the category D altogether. On this view, the 
lack of an article indicates the absence of a DP (e.g. Baker 2003; Bošković 2008); 
therefore, a basically predicative category like NP is capable of referring to individuals
by means of type-shifting operations. There is a particular type-shifter, ι,
which would be responsible for the definite interpretation of noun phrases with 
no articles or overt markers for definiteness (Chierchia 1998; Dayal 2004). 
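
The ι type-shifter mentioned above is standardly understood, following Partee (1986), as a partial function from properties to individuals. The following is a minimal sketch of that textbook definition, assuming an extensional setting in which P is a predicate of type ⟨e, t⟩; the notation is illustrative and is not taken from any particular chapter of the volume.

```latex
\iota : \langle e,t\rangle \to e, \qquad
\iota(P) =
\begin{cases}
  \text{the unique } x \text{ such that } P(x) & \text{if } |\{x : P(x)\}| = 1,\\
  \text{undefined} & \text{otherwise.}
\end{cases}
```

On this sketch, applying ι to the denotation of a bare noun yields the single individual satisfying the noun's description in the relevant context, and yields no value when there is no such individual or more than one.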

Moreover, definiteness marking, although usually encoded by determiners or 
particles in the adnominal domain, might be expressed in different syntactic pro- 
jections, for instance, in bare classifier phrases. Cheng & Sybesma (1999) claim 
that in languages like Cantonese and Mandarin Chinese the classifier head provides
the definiteness meaning when no numeral is present. Simpson et al.
(2011) study bare classifier definites in other languages (Vietnamese, Hmong, 
Bangla) and confirm the presence of this pattern, although the fact that bare
nouns may also receive definite interpretations calls into question whether classifiers
have incorporated the definiteness feature into their meaning in all such
languages. The whole extent of this panorama of definiteness marking in cate- 
gories other than D has not yet been acknowledged. 

Despite its theoretical significance, there has been surprisingly little research
on the cross-linguistic expression of definiteness. Among the few examples of
this kind of approach are the works of Dryer (2005; 2013; 2014), which register
the different patterns that languages show regarding the occurrence of definite
articles and their formal similarity with demonstratives. Another example
is Givón (1978), who discusses how the contrast between definiteness and indefiniteness,
on the one hand, and referentiality vs. non-referentiality (genericity),
on the other hand, are mapped crosslinguistically. Even with the valuable contribution
of these studies, our knowledge of definiteness across languages still calls
for a deeper typological understanding of the syntax of definite noun phrases as 
well as of the whole range of their possible interpretations. 

With the purpose of contributing to filling this gap, the present volume gath- 
ers a collection of studies exploiting insights from formal semantics and syntax, 
typological and language-specific studies, and, crucially, semantic fieldwork and
cross-linguistic semantics, in order to address the expression and interpretation 
of definiteness in a diverse group of languages, most of them understudied. 

The papers presented in this volume aim to establish a dialogue between the- 
ory and data. In doing so, they adhere to a general guideline: theories are used 
to make predictions about how definiteness is expressed in particular languages 
and what kind of semantic components it is expected to display. Theoretical predictions
determine, among other things, in which contexts of use a purported
definite expression will be acceptable and in which contexts it is likely to be re- 
jected. These predictions are confronted with empirical data not only to test the 
adequacy of current theories, but also to raise further questions about the
possible diversity of meanings attested and their corresponding forms of expres- 
sion. 

One of the goals of cross-linguistic comparison is to find patterns that are con- 
stant across languages and to identify those that are subject to variation. This is 
what, ultimately, brings together the interests of linguists willing to contribute 
to a comprehensive panorama of a particular phenomenon explored in a diverse 
pool of particular language systems. This practice has a long and reputable tradition
in practically all fields of linguistics, but studies in the semantics of the nominal
domain, especially from the formal perspective, only recently turned in
this direction, starting with the seminal work of Bach et al. (1995) on quantification.
More research from this standpoint has followed, like the works collected in
Matthewson (2008) and Keenan & Paperno (2012), to mention only some of the most
emblematic. It is to this line of work that the present volume seeks to contribute. 
Given that we can safely assume that all languages are capable of making defi- 
nite reference and that, therefore, there must be a way in every language to refer 
to particular individuals which are assumed to be known to speaker and hearer, 
or which are assumed to be unique in the relevant context of a speech-act, the 
task is to determine how they do it and which other semantic phenomena are 
associated with definiteness marking.

With these antecedents in mind, we can now sum up the main questions that
tie together the papers in this volume: What formal strategies do natural languages
employ to encode definiteness? What are the possible meanings associated
with this notion across languages? Are there different types of definite reference?
Which other functions (besides marking definite reference) are associated
with definite descriptions? In this spirit, each of the papers contained in this 
volume addresses at least one of these questions and, in doing so, we believe 
they enrich our understanding of definiteness and, with it, contribute to our
knowledge of the human capacity for language in general.


2 Overview of the volume 


This volume is composed of thirteen papers plus the editors' introduction. As 
mentioned above, the unifying factor among them is, on the one hand, the aim 
to contribute to a better understanding of how definiteness is expressed and how 
definite descriptions are interpreted in natural languages and, on the other hand, 
the fact that the authors combine theory and first-hand data in order to arrive at new
insights about this classical subject. 

The contributions are organized around three main overarching topics or questions.
The first group of papers (Schwarz, Cisneros, Šereikaitė, Irani, Pico, and Le
Bruyn) addresses the topic of how definiteness is encoded in natural languages
and which basic semantic features are involved in its expression. The second
group of papers focuses on what the syntactic locus of definiteness is and what
the relation is between definiteness marking and other projections (besides D) in
the nominal domain. This question brings together the works of Hall, Despić and
Borik & Espinal. Finally, the third group of papers (which includes Williams, de
Sá et al., Coppock & Strand, and Etxeberria & Giannakidou) deals with constructions
in which definiteness markers seem to be associated with functions or meanings
beyond canonical definite reference. In the next paragraphs, we present a
brief overview of each of the aforementioned contributions. 

Florian Schwarz's paper "Weak and strong definite articles: Meaning and form 
across languages" revisits the contrast between two types of definite descriptions 
in the light of new data drawn from a number of different languages (Hausa,
Lakhota, Mauritian Creole, Haitian Creole, among others). According to his previous
findings (Schwarz 2009), some languages differentiate overtly between definite
descriptions referring to entities that are unique (relative to some domain)
and definites that refer to entities that have been previously mentioned in discourse.
Unique definites are called weak, while familiar (anaphoric) definites are
considered strong. There is an interesting pattern found across languages that
show this distinction: "weak" definites may be overtly marked or not marked at
all, but in any case, their marker is morphophonologically less robust than the 
"strong" marker. The new data examined in this paper shows that, along with 
variations in form, strong and weak definites may also show some variations in 
meaning. For instance, in Icelandic, a strong article might be used for first-time
anaphoric references, but then in subsequent discourse, the weak form can be
used to pick up the same referent. Another semantic distinction relates to which
article is chosen when a referent meets both conditions (uniqueness and familiarity),
e.g. when referring to the family dog. German might choose the strong
article for this, while Akan apparently chooses the weak form (no article) for the same
situation. A central question present throughout this paper is whether the patterns
of semantic variation found across languages still fit within the strong/weak con- 
trast, as though they are different points within a continuum that has uniqueness 
and familiarity as endpoints, or if they are orthogonal to it. 

The weak vs. strong definite distinction is also the topic of three other papers 
in this volume. Carlos Cisneros's paper, "Definiteness in Cuevas Mixtec”, shows 
that this Otomanguean language has two means for marking definiteness: bare 
nouns, which are used to refer to entities that uniquely satisfy a noun's description,
and definite articles (derived from noun classifiers), which are used for
anaphoric definites. However, not all nouns resort to the same markers to formal- 
ize this distinction. Thus, according to their strategies for encoding uniqueness 
or familiarity, the author recognizes three types of nouns: (a) those that express 
uniqueness with a bare nominal and anaphoricity with the classifier-like article; 
(b) those that use overt marking for both types of definiteness (“irregular nom- 
inals”); and (c) those which cannot combine with definite articles at all. Nouns 
of type (b) are usually animate, so animacy seems to drive the patterns by
which nouns select their definiteness markers. The paper contributes to the dis- 
cussion put forth by Schwarz's work by underlining the possibility of variation 
between different types of definiteness-marking strategies, not only across lan- 
guages, but within a single language, likely driven by lexical classes (particularly 
by animacy features). Also, it brings up the topic of what formal devices are 
involved in marking definiteness. While definiteness markers are commonly re- 
lated to demonstratives or other types of determiners, little has been said about 
their relation with other syntactic categories, like nominal classifiers, as in Mixtec,
or adjectives, as in Lithuanian, a phenomenon discussed in Šereikaitė's
work.

Milena Šereikaitė's paper "Strong vs. weak definites: Evidence from Lithuanian
adjectives" presents an analysis of the contrast between long and short adjectives
in Lithuanian. As the author shows, in Lithuanian, a language without
articles, definiteness can be encoded in a system of two forms of adjectives that
mirrors the strong/weak distinction for definite descriptions: the long adjective
form, marked with the morpheme -ji(s), behaves like a strong article, while the
bare form, in addition to being indefinite, is licensed by uniqueness of reference,
and thus semantically resembles weak definite articles. More precisely, by examining
the behavior of nouns with long and short adjectives in different contexts,
the author shows that long adjectives are felicitous in anaphoric uses with identical
and non-identical antecedents, while the bare form of adjectives is not only
compatible with indefinite contexts (such as existentials and the introduction
of new referents into discourse), but, crucially, bare adjectives can also trigger a
definite reading in contexts that require uniqueness, such as larger situation uses
and part-whole bridging. In sum, Šereikaitė's chapter provides further support
for the distinction between strong versus weak definites, and underlines the fact 
that this distinction is not necessarily encoded in determiners or bare nouns. 

The third language-specific study in this volume directly based on Schwarz's 
strong/weak distinction for definite descriptions is Ava Irani's "On (in)definite
expressions in American Sign Language", which inquires into the nature of the
pointing sign IX and concludes that, contrary to what previous studies had proposed
(Koulidobrova & Lillo-Martin 2016), it does not correspond to a demonstrative.
The claim is based on the fact that IX is not compatible with two contexts
in which demonstratives are expected to appear: it does not allow contrastive
readings, and it cannot point to salient out-of-the-blue referents in a neutral
location. Therefore, Irani argues that when an NP referring to a previously
established locus follows IX, it behaves as a strong definite article: it can be used
in anaphora, and in producer-product bridging. By contrast, weak definite descriptions
are expressed with bare NPs, similarly to what has been observed in
classifier languages (as in Cisneros's work in this volume). In ASL, Irani argues,
both bare NPs and IX+NPs can be definite or indefinite, depending on the specification
of a locus feature, which, according to the author, suggests that in ASL
definiteness is not semantically encoded. In conclusion, Irani's work adds more
evidence to the growing body of data showing that, at least for some languages,
standard semantic approaches to definiteness, such as familiarity and uniqueness,
might not be sufficient to explain how a given NP gets its definite or its indefinite
interpretation.

Another language-specific study included in this volume is Maurice Pico's contribution
"A nascent definiteness marker in Yokot'an Maya", which discusses the
meaning of the particle ni, a reduction of the distal demonstrative jini in this
Mayan language. In the previous literature, the particle ni has been treated as a
definite determiner, despite the fact that neither uniqueness nor familiarity seems
to be a natural choice to account for the motivation behind its use. To better understand
the presence of ni, Pico carries out a detailed text analysis in terms of
Centering Theory, a framework specialized in modeling the way in which the 
changing salience of referring expressions helps to manage attention and atten- 
tion shifts throughout the discourse progression. From this analysis, Pico con- 
cludes that ni is an attentional transition marker, that is, an indicator of change 
in the discourse status of the entity evoked by an NP, and it is thus particularly 
used to perform topicality shifts. This proposal accounts for the different uses of 
ni, for its low frequency and relative optionality, and for its co-presence with the 
topic marker ba. Furthermore, the proposal is compatible with the early stage 
of grammaticalization at which the particle should stand according to the gram- 
maticalization paths proposed in the literature for the development of definite 
articles from demonstratives (Greenberg 1978; Hawkins 2004). 

The next paper in the volume explores the meaning relations between members
of different article systems. In "Definiteness across languages and in L2 acquisition",
Bert Le Bruyn claims that languages with no articles are not all equal,
and their underlying differences come to light when their speakers acquire English
as a second language. According to a previous study by Ionin et al. (2004),
speakers of Korean, Russian and Japanese as L1 overproduce definite articles in 
English when referring to specific entities, that is, to referents that are familiar 
and salient for speakers, but unknown to the hearer. Thus, overproduction of 
definite articles by speakers of these languages is seemingly triggered by this 
particular type of specificity. These results are interpreted as showing that speakers of
such languages "fluctuate" between two types of definite article systems: in one 
system (like English), definite articles are used for definite reference, irrespective 
of specificity. In other systems, like Samoan, definite articles are used for specific 
reference, whether definite or indefinite, as well as for non-specific definites. The
explanation thus provided for the overproduction of definite articles under speci- 
ficity conditions is called "the Fluctuation Hypothesis". Le Bruyn shows that L1 
speakers of Mandarin, however, do not comply with the predictions of the Fluc- 
tuation Hypothesis. Speakers did not produce definite articles for specific indef- 
inites more than they did for the non-specific ones. Therefore, their choice did 
not seem to be driven by specificity, at least not the type of specificity tested
by the previous study. The author designed a second test in which specificity
was reflected in the referent being foregrounded and noteworthy (but, crucially,
not unique or familiar), while non-specific referents were deemed such for their
being backgrounded and not noteworthy. This contrast revealed that, when overproducing
definite articles, Mandarin L1 speakers were more likely to use them
for non-specific (backgrounded) referents than for foregrounded (i.e. specific)
referents. The findings point to the need for designing a research program that
compares multiple L1s and their whole definiteness-marking resources in order to
respond to the question of how L1 influences L2 acquisition. 

The next three papers focus on determining the syntactic locus of definite- 
ness markers and on assessing the relation between definiteness marking and 
other projections in the nominal domain. "Licensing D in classifier languages
and 'numeral blocking'" by David Hall deals with definiteness in numeral classifier
languages. The paper proposes an alternative analysis to standard accounts
of definiteness in this type of system (Cheng & Sybesma 1999; Simpson 2005). In
Wenzhou Wu and Weining Ahmao, bare classifier phrases can express definiteness,
but the definite interpretation is blocked in the presence of a numeral.
The standard explanation for this fact is that the classifier may express definiteness
if it moves up to a Determiner head, but the presence of a numeral in the
Specifier of an intervening Number head blocks this movement (Simpson 2005).
By contrast, Hall argues that in these languages there are
two separate syntactic structures for Cl-N and #-Cl-N phrases.
Crucially, in the latter case, where the numeral is required, the numeral and the
classifier form a constituent, to the exclusion of the noun. In sum, Hall's paper
aims to contribute to a better understanding of the relation between the interac- 
tion of functional heads in the nominal domain and definiteness, specifically, in 
numeral classifier languages. 

The second paper addressing the interaction of nominal functional projections 
in the expression of definiteness is Miloje Despić's contribution, "On kinds and
anaphoricity in languages without definite articles". This paper studies the availability
of anaphoric readings for bare nouns in languages that do not have definite
articles, specifically, Serbian, Turkish, Japanese, Mandarin, and Hindi. Some
of these languages have number marking and others do not. Under the proposal
that these languages do not project DPs (Baker 2003; Bošković 2008; Bošković
& Gajewski 2011; Despić 2011; 2013; 2015), their anaphoric interpretations
represent a theoretical problem, since it is standardly assumed that DP is the projection
responsible for anaphoric readings, as happens in the English example
I have an apple and a pear. I gave you the apple. This suggests that there must
be some other mechanism for anaphoricity. The main empirical contribution of
the paper is a typology of interpretations for bare nouns in the studied languages, 
which highlights the correlation between the presence of number marking and 
the availability of anaphoric readings in bare nouns that refer to kinds, while 
its explanatory import is to account for all these possibilities based on Dayal's 
(2004) system of type-shifting operations. The proposal, in a nutshell, is that kind- 
referring noun phrases can only obtain anaphoric readings in languages with 
number marking and that this is due to the fact that these languages derive kind 
reference by means of a mechanism that introduces the ι type-shifter and enables
definiteness. 

Another contribution dealing with the syntax and semantics of kind-referring 
bare nouns is Olga Borik & María Teresa Espinal's paper, "Definiteness in Rus- 
sian bare nominal kinds". According to the authors, Russian bare singular nouns 
in argument position with kind-level predicates are interpreted as definite kinds. 
The general hypothesis is that definite kinds, even in a language without articles 
such as Russian, encode definiteness semantically and syntactically. In the case 
of Russian, definiteness is provided by a null D interpreted as ι. In the spirit of
emphasizing the dialogue between theory and data, the authors provide inde- 
pendent empirical semantic and syntactic data to support their claims. Thus, in 
order to demonstrate that Russian bare singular nouns are interpreted as defi- 
nites, Borik & Espinal show that they are acceptable in kind-level predicates of 
the "extinct"-type. Given that these contexts require their subject to be definite, 
it follows that, semantically, Russian bare singulars are definites. As for the syn- 
tactic evidence for a null D, the authors compare the behavior of bare plurals 
with kind reference and small nominals (which are arguably not DPs) in some of 
the contexts analyzed in Pereltsvaig (2006) - i.e. control of PRO, the possibility 
of being antecedents of reflexive pronouns, pronominal substitution, and the dis- 
tribution of relative clauses - to show that Russian bare singulars behave as one 
would expect from a DP. In conclusion, Borik & Espinal's paper deals with two of
the subjects that have long interested linguists working on definiteness: reference
to kinds and its links to definiteness and the locus of definiteness in article-less 
languages. 

The last four papers in the volume focus on non-canonical uses of definite 
noun phrases. The next two contributions deal with so-called “weak definites", an 
interpretation of definite descriptions that does not comply with the requirement 
of referring to a unique or familiar entity. Adina Williams's chapter, "A morpho- 
semantic account of weak definites and bare institutional singulars in English", 
analyzes English weak definites (as in going to the store) and bare institutional
singulars (BIS; as in going to school), which are analogous in meaning and distribution
and in this respect differ from regular definites (as in going to the
castle), which the author calls strong definites.¹ The main concern of the study
is the role that NumP plays in their interpretation, along with the denotation 
of their head noun. The author provides a morpho-semantic account of the phe- 
nomenon, according to which the particular behavior of these constructions is a 
consequence of the lexical nature of their head noun. Williams recognizes three 
lexical classes of nominal roots, each of them with different capacities regarding 
the weak/strong distinction: (i) strong-only roots, which are of type ⟨n, ⟨e, t⟩⟩,
have a count interpretation and can combine with NumP and with a regular,
strong, definite determiner; (ii) strong-weak ambiguous roots, which can be of
type ⟨n, ⟨e, t⟩⟩, are countable and combine with NumP and with a regular determiner,
or, alternatively, are of type ⟨e, t⟩, not number specific, and may combine
with a weak determiner; (iii) BIS roots, which can be of type ⟨n, ⟨e, t⟩⟩ and behave
as class (i), or of type ⟨k, t⟩, in which case they are incompatible with a
determiner but can semantically incorporate. The syntactic consequence of the 
lexical differences between regular and weak definites and bare institutional sin- 
gulars is that, whereas the first type projects both NumP and DP, the second type 
projects only DP, and the third type does not project either of them. As a seman- 
tic consequence, there are three different types of compositional derivations of 
definite noun phrases: one for regular definites, one for weak definites and one 
for bare institutional singulars. 

The second paper devoted to weak definites is "Is the weak definite a generic? 
An experimental investigation", a paper coauthored by Thaís de Sá, Greg N. Carl- 
son, Maria Luiza Cunha Lima and Michael K. Tanenhaus. The authors present 
data from a corpus study and four experiments aiming to examine the differ- 
ent interpretative properties of weak definites in comparison with regular and 
generic definites. This comparison becomes relevant given that some of the existing
semantic accounts of weak definites, in particular, Aguilar-Guevara & Zwarts 
(2011; 2013), assume that they are completely different from regular definites and 
closer to generic definites. The results of the studies offered in this paper show 
that weak definites behave neither as regular strong definites nor as generic definites
(as in The hospital is not my favorite place). The corpus study revealed that
weak definites and generics are not in complementary distribution in any of the 
syntactic environments in which they appear. Moreover, the majority of weak 
definites occurred in clauses with activity and telic predicates, while generic def- 
inites occurred more in clauses with stative and activity predicates. Experiment 
1 showed that, whereas regular definites were judged as denoting an individual,
generic definites were judged to be about a category, and in this respect, weak
definites behaved more similarly to the former than to the latter. Experiment
2 showed that regular definites license more continuations containing coreferring
anaphoric noun phrases than generic definites, which encourage more interpretations
introducing new events; in this respect, weak definites again showed
more similarity with regular definites than with generic definites. Experiment
3 revealed analogous results in a free completion task. Finally, Experiment 4 required
participants to repeat the target noun phrases in their completions; the
completions triggered by each condition suggest that generics behave differently
from both regular and weak definites.

¹ Notice that this means that the weak/strong distinction Williams refers to is not the same one
adopted by Schwarz (2009).

Just as weak definites deviate from the canonical semantic reference of defi- 
nite descriptions, definite determiners also occur in constructions where a simple 
account based on familiarity or uniqueness is not sufficient. One of these non-canonical
types of definiteness is the one observed in superlative constructions
composed of a definite marker plus a comparative one, as in Este libro es el más
interesante (literally, "This book is the more interesting") in Spanish. In their
chapter, "Most vs. the most in languages where the more means most", Elizabeth
Coppock and Linnea Strand study the expression of superlativity in French, Spanish,
Italian, Romanian, and Greek, languages in which the illustrated construction is allowed. The
authors provide a classification of superlative constructions based on a number 
of distributional and interpretative criteria, such as prenominal vs. postnominal 
position, adjectival vs. adverbial domain, qualitative vs. quantitative reading, ab- 
solute vs. relative reading, and relative vs. proportional reading. Among the dif- 
ferent subtypes of constructions, the presence/absence of definiteness markers 
varies from language to language. The chapter makes two explanatory contribu- 
tions. First, it argues that the variety of patterns found in the studied languages 
regarding the presence/absence of a definite marker is due to the interaction of 
two competing pressures within the grammar. One of them is the pressure to 
mark uniqueness overtly. The other is the pressure to avoid combining a definite 
determiner with a predicate of entities other than individuals, such as events or 
degrees. In conjunction with some assumptions regarding the semantics of var- 
ious types of superlatives, these pressures result in a disinclination for certain 
patterns. The second explanatory takeaway of this chapter is a compositional 
analysis of the described superlative constructions, based on standard and
more recent mechanisms proposed in formal semantics (Functional Application,
Definite Null Instantiation, and Measure Identification). 

The volume closes with another study of a non-canonical use of definite determiners.
Urtzi Etxeberria and Anastasia Giannakidou's paper, "Definiteness,
partitivity and domain restriction: A fresh look at definite reduplication", tries to
find a link between two phenomena that up to now had been considered indepen- 
dent: definite reduplication in Greek and overt domain restriction in quantifier 
phrases in Basque, Greek, Bulgarian and Hungarian. Based on judgments about 
the interpretation of doubly-marked definites (like the fact that they are infelici- 
tous when only one entity in the context satisfies the predicate provided by the 
adjective) they argue that Greek definite reduplication has a partitive-like inter- 
pretation, and thus, the second definite marker (the one that precedes the adjec- 
tive) is in fact a domain restrictor. The paper thus explores the possibility that D 
performs two different types of functions cross-linguistically: a saturating and a 
non-saturating type. Saturating D yields e-type expressions after combining with 
a predicate of type ⟨e, t⟩. That is the common case of definiteness markers, like the ones
that have been discussed through most of the papers in this volume, where the re- 
sulting DP refers to a unique, salient or familiar individual. The non-saturating D, 
in contrast, combines with a given expression only to yield another expression of 
the same semantic type. If it combines with a predicate, as in Greek polydefinites, 
it yields a predicate-like expression (as in Greek definite reduplication), and if it 
combines with a generalized quantifier, it yields a domain-restricted quantifier, 
as in quantifier expressions in the languages analyzed. 
Chapter 1


Weak vs. strong definite articles: 
Meaning and form across languages 


Florian Schwarz 


University of Pennsylvania 


One line of recent work on definite articles has been concerned with languages 
that utilize different forms for definite descriptions of different types. In the first 
part of this paper, I discuss the semantic analysis of the underlying distinction of 
weak and strong definite articles as proposed in Schwarz (2009), which formalizes 
the contrast in terms of uniqueness (for weak articles) vs. anaphoricity (for strong 
articles). I also review the empirical motivation for the analysis based on German 
preposition-determiner contraction and its implications for related semantic phe- 
nomena. The second part of the paper surveys recent advances in documenting 
contrasts between definites in various other languages. One issue here is to
assess to what extent the cross-linguistic contrasts are uniform in terms of their
semantics and pragmatics, and to what extent there is variation in the relevant 
patterns. A second issue is to evaluate how the obvious variation in the formal 
realization of the contrast across languages can contribute to a more refined imple- 
mentation of the contrast in meaning. 


1 Introduction 


Definite descriptions have played a central role in the study of meaning in natural
language right from the start, going back to early work by Frege (1892), and lead- 
ing to the famous debate in the philosophy of language between Russell (1905) 
and Strawson (1950), with continued interest in related issues (for an extensive 
collection, see Reimer & Bezuidenhout 2004). One central reason for this would
seem to be that they offer a particularly insightful perspective on how (at least 
potentially) different dimensions of meaning differ from one another and inter- 
act, as well as on the role of context in interpreting linguistic utterances. Work in 
linguistics has also been concerned with similar issues, specifically with regard
to related questions about the interplay of contextual information and grammatical
representations, in particular concerning mechanisms for quantificational
co-variation, starting most prominently with Heim (1982).¹

One line of work on definite articles that has gained prominence in recent 
years has been concerned with languages that utilize different forms for definite 
descriptions of different types. While there is a fairly rich tradition in the more 
descriptive literature, especially on German dialects, going back at least to Hein- 
richs (1954), the notion that languages might have more than one type of definite 
article (beyond mere inflectional variations), with different semantic-pragmatic 
profiles, only received more widespread attention in the formal semantics literature
in the 2000s. The present paper begins with a review of the analytical
approach proposed in Schwarz (2009). It characterizes the distinction between 
weak and strong definite articles in terms of uniqueness (for weak articles) vs.
anaphoricity (for strong articles). The formal analysis is empirically motivated 
by data on German preposition-determiner contraction, and I briefly discuss the 
main data points in its favor, as well as its implications for related semantic phe- 
nomena. 

The second part of the paper surveys recent advances in documenting con- 
trasts between definites in various other languages. One focus here will be on 
assessing to what extent the cross-linguistic contrasts are uniform in terms of 
their semantics and pragmatics, and to what extent there is variation in the rel- 
evant patterns. A second focus is to evaluate how the obvious variation in the 
formal realization of the contrast across languages can contribute to a more re- 
fined implementation of the contrast in meaning, and how this relates to noun 
phrase structure more generally. While a fair amount of the cross-linguistic data 
supports the analytical contrast in terms of the weak vs. strong article distinction,
there certainly is variation in definite contrasts beyond that. I briefly discuss one 
alternative family of proposals for capturing such variation from the literature, 
and also sketch some tentative analyses of additional points of variation. 

Before moving on, let me issue a few caveats concerning the limitations in 
scope of the present inquiry. First of all, I start from the theoretical distinction I 
proposed in earlier work, and explore how it fares with regard to a set of
cross-linguistic data bearing on the relevant phenomena and contrasts. This should not
be taken to suggest that other theoretical approaches, beyond the ones consid- 
ered here, have no role to play in the analysis of definite descriptions. Rather, 


¹ For a comprehensive recent proposal from the perspective of situation semantics, see Elbourne
(2013). 
it is simply a decision grounded in a theory-driven approach to empirical data,
within which it makes sense to explore to what extent a particular analysis can 
deal with empirical facts. Relatedly, a core part of the proposal under considera- 
tion, as things stand, is that it makes a binary distinction. This may well turn out 
to be too limited, as further levels of distinction are likely to be relevant to capture 
all the data. Another aspect of the theoretical approach is that it takes notion(s) 
of definiteness developed on the basis of familiar languages such as English and 
German to analyze a variety of other languages. That may well come with its own 
pitfalls, but we have to start somewhere, and re-evaluate later to what extent 
those notions are suitable for spelling out the broader cross-linguistic picture. 
Finally, I limit my attention here to the form and meaning of definite descrip- 
tions alone, without consideration of indefinites. This, too, may be problematic 
in the long term, as at least some key effects in a given language may relate to the 
system of definite and indefinite expressions it has at its disposal. These caveats 
notwithstanding, I hope that the following contributes to our understanding of 
the typology of definiteness by evaluating a detailed formal proposal in light of 
a broader range of cross-linguistic data. 


2 Two types of definite articles 


2.1 Two semantic perspectives on definite descriptions 


Broadly speaking, there are two families of approaches to analyzing definite 
descriptions that have been predominant in the formal literature, namely ones 
based on the notion of uniqueness, on the one hand, and ones based on the no- 
tion of familiarity or anaphoricity on the other hand. I provide a sketch of each 
of these here, following the bulk of the literature in seeing them as comprehen- 
sive proposals that aim to capture all data on definite descriptions, as is desirable 
for reasons of theoretical parsimony (see below for some pointers to mixed ap- 
proaches in the literature). 

Starting with uniqueness-based approaches, the intuitive motivation is based 
on examples such as the following: 


(1) Context: Speaker is standing in an office with exactly one table. 
The table is covered with books. 


The central idea here is that definite descriptions pick out an individual that 
uniquely fits the provided description. Formally speaking, the analysis is usually 
cast in terms of a definite description of the form the NP encoding that a) there is 
an entity in the extension of NP (the existence condition) and b) that the number
of such entities does not exceed one (the uniqueness condition). This is at the core
of both the traditions following Russell and Frege/Strawson, though they differ 
in the status they accord these conditions. But they agree that in the end, refer- 
ence is effectively established via uniqueness (though note that they need not see 
the definite description itself as directly referential; Russell sees it as quantifica- 
tional), so that the individual that gets talked about is precisely the one uniquely 
satisfying the nominal description. 
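
The two conditions, and the difference in the status accorded to them, can be written out explicitly. The following is only a sketch in standard first-order notation; the predicate symbols NP (for the nominal description) and Q (for the sentence predicate) are illustrative choices, not notation taken from the works cited. On the Russellian rendering, existence and uniqueness are part of what a sentence of the form "the NP Q" asserts; on the Frege/Strawson rendering, they are preconditions for the iota term to denote at all.

```latex
% Conditions encoded by "the NP"
\exists x\, \mathit{NP}(x)
  \qquad \text{(existence)}
\forall x\, \forall y\, [(\mathit{NP}(x) \land \mathit{NP}(y)) \rightarrow x = y]
  \qquad \text{(uniqueness)}

% Russell: both conditions are asserted together with the predication
\exists x\, [\mathit{NP}(x) \land \forall y\, [\mathit{NP}(y) \rightarrow y = x] \land Q(x)]

% Frege/Strawson: both conditions are presupposed;
% the term \iota x\, \mathit{NP}(x) is defined only if exactly one NP exists
Q(\iota x\, \mathit{NP}(x))
```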

For present purposes, a key point to note right away is that any analysis 
grounded in uniqueness faces an obvious challenge - namely that, taking (1) 
as our example, there are many tables in the world. The standard remedy, ex- 
tensively spelled out by Neale (1990), is to appeal to a general mechanism of 
domain restriction, which has to be assumed independently for other kinds of 
noun phrases (and likely for other constructions as well). While the general idea 
of - and need for - such a mechanism is fairly straightforward and intuitive, its 
technical implementation is not, though we will not get into further detail here 
for reasons of space.²

One standard type of definite usage that constitutes a challenge for uniqueness- 
based approaches is one involving a preceding indefinite that introduces the in- 
tended referent of the definite: 


(2) a. I got a table and an armchair delivered to my office.
    b. The table is already covered with books.


Crucially, and unlike (1) above, this example is perfectly compatible with there 
being another table in the office, which both the speaker and the addressee are 
aware of. The challenge for a uniqueness-based account of domain restriction 
then is to formulate the general purpose domain restriction machinery in such 
a way that the previous mention of the indefinite can bring it about that the 
domain only includes the newly delivered table, i.e. does not include everything 
in the office, even though we may very well be talking about the office as a whole 
in the larger conversation. 

Examples like (2) constitute the core intuitive motivation for the second main 
approach to definite descriptions in the formal literature. It sees definites as func- 
tioning in a way rather parallel to pronouns (in a traditional view), and goes back 
to Christophersen (1939). The highly influential, and first fully fleshed out mod- 
ern account along these lines comes from Heim (1982) (with a similar perspective 


² For influential proposals, see, e.g., Westerståhl (1984), von Fintel (1994), Stanley & Szabó (2000),
Elbourne (2013). 
offered by Kamp 1981), who proposes that definite descriptions come with an index,
which has to be one that is already established, or familiar, in the discourse.
The job of indefinites, in contrast, is to introduce new indices to the discourse, 
yielding a straightforward account of (2) as involving the establishment of an in- 
dex mapped onto the newly delivered table in (2a), which is then anaphorically 
picked up by the definite in (2b). 
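
The division of labor between indefinites and definites on this view can be illustrated with a deliberately simple sketch. It is not Heim's formal system, which is a dynamic semantics defined over files and assignments; the class `Discourse`, its methods, and the entity labels are hypothetical names introduced here purely for illustration. The sketch only shows the bookkeeping idea: an indefinite adds a new index, and a definite is felicitous only if its index is already familiar.

```python
class Discourse:
    """Toy discourse record mapping indices to referents (illustrative only)."""

    def __init__(self):
        self.referents = {}   # index -> (noun, entity)
        self._next = 1

    def indefinite(self, noun, entity):
        """An indefinite introduces a NEW index for its referent."""
        index = self._next
        self._next += 1
        self.referents[index] = (noun, entity)
        return index

    def definite(self, noun, index):
        """A definite must carry an index that is already familiar."""
        if index not in self.referents:
            raise LookupError(f"index {index} is not familiar")
        known_noun, entity = self.referents[index]
        if known_noun != noun:
            raise LookupError(f"index {index} is not a {noun}")
        return entity


d = Discourse()
# (2a) "I got a table and an armchair delivered to my office."
i_table = d.indefinite("table", "new_table")
i_chair = d.indefinite("armchair", "new_armchair")
# (2b) "The table is already covered with books."
referent = d.definite("table", i_table)
```

Because the definite in (2b) is resolved through its index rather than through uniqueness, the presence of a second table in the office is irrelevant in this toy model, which mirrors the intuition behind (2).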

As may be obvious by now, the initial example in (1) in turn constitutes a 
challenge for accounts based on familiarity, as there is no previous mention of 
the table there. The standard approach for tackling this challenge is to detach the 
notion of familiarity from the presence of a linguistic antecedent, e.g. by allowing 
entities physically present in the utterance context to count as familiar as well.³
This needs to be further extended, however, to deal with cases of so-called "global
uniques", such as the sun or the pope.

Rather than diving further into the intricacies of how each of the two ac- 
counts sketched above can deal with various challenging cases, we now turn 
to another perspective, which bites the bullet and admits that both analyses ade- 
quately capture how parts of natural language work. While this may seem, from 
an a priori perspective committed to theoretical parsimony, like admitting de- 
feat, such an approach gains empirical motivation once languages that explicitly 
differentiate between different types of definite articles are considered. This is 
precisely the perspective put forward in Schwarz (2009), with a detailed empir- 
ical discussion of variation in contraction of definite articles and prepositions. 
The central argument is that certain forms (namely the contracted ones) behave 
exactly as expected from a uniqueness-based approach, whereas others (the non- 
contracted ones) exhibit the behavior we would expect from an approach that 
sees definites as anaphoric. To the extent that parallel patterns are found across 
other languages, the general empirical case for a richer theoretical inventory gets 
strengthened further, and one central aim of the present paper is to survey the 
evidence from a variety of other languages in this regard. In addition, the richer 
theoretical tool-box can also be put to use to deal with some of the complexities 
in languages without any obvious contrast between different definite articles, 
such as English, though that part of the story will not be pursued here, and it 
remains to be seen just how the English facts should be captured in light of this
perspective.⁴


³ For extensive discussion of the pertinent distinction between weak and strong familiarity, see
Roberts (2003).

⁴ For previous discussion of English data going beyond what can be captured using just one of
the two approaches above, see, a.o., Birner & Ward (1994), Poesio & Vieira (1998).
2.2 Distinctions between definite articles in German and Germanic
dialects 


Much early descriptive work on contrasts between definite articles focused on 
German and Germanic dialects.⁵ The first detailed discussion of Germanic dialects
with two forms for definite articles that I am aware of dates back to
Heinrichs (1954), who discusses dialects of the Rhineland (see also Hartmann 
1967). Other dialects for which this phenomenon has been described include the 
Mönchengladbach dialect (Hartmann 1982), the Cologne dialect (Himmelmann
1997), Bavarian (Scheutz 1988; Schwager 2007) and Austro-Bavarian (Brugger & 
Prinzhorn 1996; Wiltschko 2013), Viennese (Schuster & Schikola 1984), Hessian 
(Schmitt 2006), and, perhaps the best documented case, the Frisian dialect of 
Fering (Ebert 1971a,b).⁶ A parallel phenomenon also exists in Standard German,
although here the contrast is only present in particular morphological environ- 
ments (Hartmann 1978; 1980; Haberland 1985; Cieschinger 2006; Puig Waldmüller 
2008; Schwarz 2009). I will begin with some brief illustrations from Fering as a 
well-documented case with two fully distinct paradigms for definite articles, and 
then introduce the basic contrast in Standard German. Somewhat more subtle 
German data will be discussed in the following section to flesh out the nature of 
the contrast in meaning between the different articles. 

The basic paradigm for what Ebert (1971b) calls the A-article and the D-article 
is presented in Table 1. The examples in (3) illustrate the contrast between the 
two. 


Table 1: The definite article paradigms in Fering (Ebert 1971b: 159) 


             m.Sg.   f.Sg.   n.Sg.   Pl.
A-article    a       at      at      a
D-article    di      det     det     dön

(3) Fering (Ebert 1971b: 161) 


a. Ik skal deel tu a       / *di        kuupmaan.
   I  must down to theweak /  thestrong grocer
   ‘I have to go down to the grocer.’


⁵ Parts of this section are adapted from Schwarz (2013).
⁶ Leu (2008) discusses related matters in Swiss German, although he focuses on syntactic issues.
b. Oki hee an hingst keeft.  *A      / Di        hingst haaltet.
   Oki has a  horse  bought  theweak / thestrong horse  limps
   ‘Oki has bought a horse. The horse limps.’


A parallel contrast can be observed in Standard German, where certain combi- 
nations of prepositions and definite determiners can, but do not have to, contract 
(see, among others, Hartmann 1978; Haberland 1985; Cieschinger 2006). 


(4) German (Schwarz 2009: 7) 
a. Hans ging zum        Haus.
   Hans went to_theweak house
   ‘Hans went to the house.’
b. Hans ging zu dem       Haus.
   Hans went to thestrong house
   ‘Hans went to the house.’


Descriptively, the two forms seem to correspond straightforwardly to the two 
distinct definite articles in Fering, and I will assume in what follows that contraction
reflects which article form is at play.⁷ Table 2 introduces the terminology I
use to refer to the different forms, with the weak article corresponding to Ebert’s
A-article and the strong one to her D-article.⁸


Table 2: Terminology for the German article forms 


Form      Article type   Gloss
zum       weak           P_theweak
zu dem    strong         P_thestrong


⁷ A word of caution is in order concerning variation in contraction: some contractions are more
colloquial than others, and there are corresponding differences in frequencies in written texts. 
My discussion focuses on prescriptively fully recognized cases, to avoid prescriptive biases 
against contraction, but the full range of phenomena is broader, and may even extend to dif- 
ferences of phonetic realization of articles in environments where contraction is not available. 
See Schwarz (2009: §2) for further discussion. 

"The notions weak and strong have been used to group determiners in various other ways: 
Milsark (1977) used the existential construction discussed in the introduction to identify "weak"
determiners, while Herburger (1997) makes yet another distinction. Finally, Carlson et al. (2006)
introduce the notion of "weak definites" (with an earlier, related use by Poesio 1994), briefly
discussed below. To avoid confusion, I will generally use the terms weak article and strong 
article (definites) in talking about the distinction introduced here. 
The next section discusses the German contraction data in some detail to flesh
out precisely what contrasts in meaning and use are associated with the two 
forms. 


2.3 The contrast in meaning between weak and strong articles 


The key concern for our purposes is to what extent the two different article forms 
differ in their meaning and conditions of use. As is the case in Fering (3), weak 
and strong article definites in German are not in free variation, but rather seem 
to be subject to different contextual constraints: 


(5) German 


In der Kabinettsitzung heute wird ein neuer Vorschlag vom
in the cabinet meeting today is a new proposal by_theweak
{✓Kanzler / #Minister} erwartet.
chancellor / minister expected
‘In today's cabinet meeting, a new proposal by the chancellor/minister
is expected.’


The minimal contrast in availability of the weak article, based on whether the 
noun is Kanzler (‘chancellor’) or Minister (‘minister’) illustrates that the weak 
article requires uniqueness: in a given cabinet meeting, there is only one chan- 
cellor, but several ministers, thus unique reference can only be successful for the 
former. In contrast, the strong article does not seem to benefit similarly from 
contextual uniqueness: 


(6) German 
# In der Kabinettsitzung heute wird ein neuer Vorschlag von dem
in the cabinet meeting today is a new proposal by thestrong
Kanzler erwartet.
chancellor expected
‘In today's cabinet meeting, a new proposal by the chancellor is
expected.’


Without further context, it is not available to refer to a minister, either, but 
as soon as one minister has been introduced explicitly in prior discourse, this 
becomes perfectly straightforward: 
(7) German
a. Hans hat gestern einen Minister interviewt.
Hans has yesterday a minister interviewed
‘Hans interviewed a minister yesterday.’
b. ✓ In der Kabinettsitzung heute wird ein neuer Vorschlag von
in the cabinet meeting today is a new proposal by
dem Minister erwartet.
thestrong minister expected
‘In today's cabinet meeting, a new proposal by the minister is
expected.’


Yet another example driving home the contrast between weak and strong ar- 
ticles is provided in (8): 


(8) German (Schwarz 2009: 30) 
In der New Yorker Bibliothek gibt es ein Buch über Topinambur.
in the New York library exists EXPL a book about topinambur
Neulich war ich dort und habe #im / in dem Buch nach
recently was I there and have in_theweak / in thestrong book for
einer Antwort auf die Frage gesucht, ob man Topinambur grillen
an answer to the question searched whether one topinambur grill
kann.
can
‘In the New York public library, there is a book about topinambur.
Recently, I was there and searched in the book for an answer to the
question of whether one can grill topinambur.’


Taken together, these facts suggest that uniqueness is neither necessary nor
sufficient for reference with the strong article. Instead, it seems to require an 
antecedent, here the indefinite, to refer to anaphorically. The two articles thus 
differ in the way they relate to their context, and they do so in a way that seems 
to line up rather naturally with the two main theoretical approaches to definites. 

Consideration of further cases, which have been extensively discussed in the 
literature, extends this perspective in interesting ways. So-called bridging uses 
(Clark 1975; Hawkins 1978; Prince 1981) involve definites that seem to relate back 
to the preceding context in more indirect ways. 
(9) a. John was driving down the street.
    b. The steering wheel was cold.


(10) a. John bought a book today. 
b. The author is French. 


The steering wheel in (9) is of course understood as belonging to the car in- 
volved in the driving event in the first sentence. Similarly, the author in (10) is 
understood to be the one who authored the previously mentioned book. But how 
should these relations to the preceding context be seen theoretically? As it turns 
out, the German articles differentiate between these two standard cases in a the- 
oretically interesting way, such that the weak article is used in the former case, 
but the strong article in the latter. 


(11) German (Schwarz 2009: 52-53) 

a. Part-whole relation 
Der Kühlschrank war so groß, dass der Kürbis problemlos
the fridge was so big that the pumpkin without a problem
im / #in dem Gemüsefach untergebracht werden
in_theweak / in thestrong crisper stowed be
konnte.
could
‘The fridge was so big that the pumpkin could easily be stowed in the
crisper.’

b. Producer relation
Das Theaterstück missfiel dem Kritiker so sehr, dass er in seiner
the play displeased the critic so much that he in his
Besprechung kein gutes Haar #am / an dem Autor ließ.
review no good hair on_theweak / on thestrong author left
‘The play displeased the critic so much that he tore the author to
pieces in his review.’


The first example is entirely unsurprising if we assume that the weak article re- 
quires uniqueness (plus a suitable mechanism for domain restriction, as needed 
for any uniqueness-based account), assuming that there is a unique crisper in 
the mentioned fridge. The second case is more interesting, and arguably informs 
just what mechanisms are at play in relating the interpretation of definites to 
the context. Taking the above illustrations of the role of anaphoricity for strong 
article definites seriously, the most straightforward analysis here is that the relational
noun can have its relatum slot filled by an anaphoric index, which links
the author directly back to the aforementioned book. 

Looking beyond simple referential cases, it is well known that definites can 
also receive co-varying interpretations in quantificational contexts. Interestingly, 
both types of bridging examples (as well as ones parallel to the simple unique and
anaphoric examples above) generalize to such environments: 


(12) German 
a. Jeder Student, der ein Auto parkte, brachte einen Parkschein
every student that a car parked attached a parking-pass
am / #an dem Rückspiegel an.
on_theweak / on thestrong rear-view-mirror PART
‘Every student that parked a car attached a parking pass to the
rearview mirror.’

b. Jeder, der einen Roman gekauft hat, hatte schon einmal eine
everyone that a novel bought has had already once a
Kurzgeschichte #vom / von dem Autor gelesen.
short story by_theweak / by thestrong author read
‘Everyone that bought a novel had already once read a short story by
the author.’


This is of substantial theoretical importance, as the analysis of co-variation 
under quantifiers is at the core of the interaction between contextual information
and grammatical machinery. Thus, any analysis of the contrast between definite 
article forms must be rich enough to extend to a broader framework that can 
account for co-variation. A simple story in terms of purely pragmatic constraints 
on reference and contexts of use that is not tied into these more intricate aspects 
of grammar would thus fall short. 


2.4 Sketch of the analysis in Schwarz (2009) 


The core of the analysis of the two types of definites in Schwarz (2009) is that 
weak article definites are referential expressions (of type e) that presuppose that 
there is a unique entity meeting the description of the noun phrase (in the tra- 
dition of Frege and Strawson). In contrast, strong article definites involve an ad- 
ditional anaphoric component, captured by a (pronoun-like) index introduced as 
a syntactic argument of the strong article. The analysis is couched in a broader 
framework to capture the bridging data, as well as the interplay of context and
grammatical mechanisms behind co-variation in different ways for the two cases. 
Starting with the weak article, the analysis assumes that a syntactically repre- 
sented situation pronoun is an argument of the determiner, which provides the 
means for ensuring an appropriate domain restriction relative to which uniqueness
holds.⁹ Semantically, the weak article denotes a function that takes a situation
and a property as arguments, and returns the unique entity that has the
property in that situation, if there is one (else, its denotation is undefined). 


(13) a. [DP [theweak s] NP]
     b. $[\![\text{the}_{\text{weak}}]\!]^{g} = \lambda s .\, \lambda P_{\langle e, st \rangle} .\, \iota x\, [P(x)(s)]$
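
To make the behavior of this denotation concrete, here is a minimal sketch that models it as a curried partial function. The modelling choices are assumptions made for illustration and are not part of the analysis in Schwarz (2009): a situation is simply a set of entities, a property is a function from an entity and a situation to a truth value, and the names `ROLES`, `has_role`, and the entity labels are hypothetical. The data loosely follow the cabinet meeting scenario in (5).

```python
# Hypothetical toy model; entity labels and the ROLES lexicon are illustrative.
ROLES = {"c1": "chancellor", "m1": "minister", "m2": "minister"}


def has_role(role):
    """Build a property: a function from an entity and a situation to a bool."""
    return lambda x, s: x in s and ROLES.get(x) == role


def the_weak(s):
    """Toy counterpart of (13b): situation -> property -> unique entity."""
    def take_property(p):
        witnesses = [x for x in s if p(x, s)]
        if len(witnesses) != 1:
            # Existence or uniqueness fails: the denotation is undefined.
            raise ValueError("no unique entity fits the description")
        return witnesses[0]
    return take_property


cabinet = frozenset(ROLES)  # the situation of the cabinet meeting
chancellor = the_weak(cabinet)(has_role("chancellor"))

try:
    the_weak(cabinet)(has_role("minister"))
    minister_defined = True
except ValueError:
    minister_defined = False  # two ministers, so uniqueness fails
```

The situation argument does the work of domain restriction in this sketch: relative to a smaller situation containing only one minister, such as `frozenset({"m1"})`, the same description would pick out that minister without any change to the article's meaning.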


The value of the situation pronoun is essentially determined in the same way as
that of regular pronouns: it can receive its value from the assignment function, 
which captures the case where definites are interpreted independently of the 
situation relative to which the sentence as a whole is interpreted (i.e. relative to 
a resource situation, following the terminology of von Fintel 1994). Alternatively, 
it can be bound, either in such a way that it is identified with the topic situation 
(that the sentence as a whole is about), or by a quantificational expression, in 
which case the denotation of the definite as a whole co-varies with the situations 
quantified over. 

The strong article minimally differs from the weak article in that it takes an 
additional individual (type e) argument, which is syntactically introduced by an 
index (that is semantically equivalent to a pronoun). The referent of the definite 
as a whole is identified with the value of this index (with the exception of bridging 
cases, discussed below). 


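(14) a. [DP i [[thestrong s] NP]]
     b. $[\![\text{the}_{\text{strong}}]\!]^{g} = \lambda s .\, \lambda P_{\langle e, st \rangle} .\, \lambda y .\, \iota x\, [P(x)(s) \mathbin{\&} x = y]$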
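
The effect of the extra argument can be illustrated with a small sketch along the following lines. As before, the modelling choices are assumptions for illustration only: situations are sets of entities, properties are functions from an entity and a situation to a truth value, the assignment function is a plain dictionary `g`, and `ROLES`, `has_role`, and the entity labels are hypothetical. The scenario loosely follows (7), where a previously mentioned minister supplies the value of the index.

```python
# Hypothetical toy model; entity labels and the ROLES lexicon are illustrative.
ROLES = {"c1": "chancellor", "m1": "minister", "m2": "minister"}


def has_role(role):
    """Build a property: a function from an entity and a situation to a bool."""
    return lambda x, s: x in s and ROLES.get(x) == role


def the_strong(s):
    """Toy counterpart of (14b): situation -> property -> index value -> entity."""
    def take_property(p):
        def take_index_value(y):
            witnesses = [x for x in s if p(x, s) and x == y]
            if len(witnesses) != 1:
                # The value of the index does not fit the description.
                raise ValueError("index value does not satisfy the description")
            return witnesses[0]
        return take_index_value
    return take_property


cabinet = frozenset(ROLES)

# Assignment function: the indefinite in (7a) is assumed to have made index 1
# available and mapped it to one particular minister.
g = {1: "m2"}
referent = the_strong(cabinet)(has_role("minister"))(g[1])

# A crude stand-in for binding the index: the referent co-varies with its value.
covarying = [the_strong(cabinet)(has_role("minister"))(y) for y in ("m1", "m2")]
```

Note that the presence of two ministers in the situation causes no problem here, since the identity condition x = y narrows the description down to the value of the index; this is the sense in which contextual uniqueness of the noun's extension is not required for the strong article.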


The additional index argument of the strong article essentially introduces a 
familiarity constraint, as the context has to provide a value for the index via the 
assignment function. A preceding indefinite is one standard way for ensuring 
that, though other options may exist as well. While the issue of just how a ref- 
erent for a strong article definite can be made familiar in a suitable way in the 
context deserves more in-depth exploration (also in relation to prior discussions 


⁹ It also accounts for the various interpretations of definites in the scope of intensional operators;
see Schwarz (2009) for detailed discussion.
of familiarity in the literature), I will limit discussion here to the former case,
because it is easiest to control for in example contexts. 
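
To make the difference between the two lexical entries concrete, the following is a minimal sketch of (13b) and (14b) in Python. It is an illustration under simplifying assumptions, not part of the analysis in Schwarz (2009): situations are modeled as plain dictionaries from noun labels to sets of entities, undefinedness is modeled as None, and all names (noun, entities, iota, the_weak, the_strong, s1, g) are hypothetical.

```python
from typing import Callable, Dict, Optional, Set

Entity = str
# Toy situation: maps a noun label to the set of entities satisfying it there.
Situation = Dict[str, Set[Entity]]
# Properties have type <e,<s,t>>: P(x)(s) is True or False.
Property = Callable[[Entity], Callable[[Situation], bool]]


def noun(name: str) -> Property:
    """Property that holds of x in s iff x is in the extension of `name` in s."""
    return lambda x: lambda s: x in s.get(name, set())


def entities(s: Situation) -> Set[Entity]:
    """All entities that occur anywhere in the toy situation s."""
    return set().union(*s.values())


def iota(domain: Set[Entity], condition: Callable[[Entity], bool]) -> Optional[Entity]:
    """The unique element meeting `condition`; None stands in for undefinedness."""
    matches = [x for x in domain if condition(x)]
    return matches[0] if len(matches) == 1 else None


def the_weak(s: Situation):
    # (13b): lambda s. lambda P. iota x [P(x)(s)]
    return lambda P: iota(entities(s), lambda x: P(x)(s))


def the_strong(s: Situation):
    # (14b): lambda s. lambda P. lambda y. iota x [P(x)(s) & x = y]
    return lambda P: lambda y: iota(entities(s), lambda x: P(x)(s) and x == y)


# Hypothetical situation with two books and one author.
s1: Situation = {"book": {"b1", "b2"}, "author": {"a1"}}
# Hypothetical assignment function: index 1 was introduced by a preceding indefinite.
g = {1: "b1"}

weak_author = the_weak(s1)(noun("author"))        # unique author in s1
weak_book = the_weak(s1)(noun("book"))            # no unique book: undefined (None)
strong_book = the_strong(s1)(noun("book"))(g[1])  # the book identical to g(1)
```

In this toy setting the weak article fails for "book" because uniqueness does not hold in s1, whereas the strong article succeeds once the assignment function supplies a value for the index, which mirrors the familiarity constraint described above.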

In addition to receiving a value contextually, the index can also be bound in 
various ways, rendering co-varying readings. Fundamentally, once we subscribe 
to the above meanings for the weak and strong articles, we are committed to 
allowing for both of the standard mechanisms for introducing co-variation for 
definites, namely via binding of the situation pronoun or of the index. Yet a 
further key consequence for interpretation in context more generally is that the 
specific analysis in Schwarz (2009) leaves no role to play for domain restriction 
via C-variables (basically, pronouns for predicates; see von Fintel 1994 and Stan- 
ley & Szabó 2000). 
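
The two routes to co-variation can be sketched in the same toy fashion. The Python fragment below is only an illustration under stated assumptions: situations are dictionaries from noun labels to sets of entities, a single helper iota covers both articles (without an index it mimics the weak article, with an index the strong one), and the situations, entities, and names (towns, library, by_situation, by_index) are invented for the example.

```python
from typing import Dict, Optional, Set

Situation = Dict[str, Set[str]]


def iota(s: Situation, noun: str, index: Optional[str] = None) -> Optional[str]:
    """Unique `noun` entity in s (identical to `index`, if given); None if undefined."""
    matches = [x for x in s.get(noun, set()) if index is None or x == index]
    return matches[0] if len(matches) == 1 else None


# Hypothetical situations quantified over, e.g. by 'in every town'.
towns: Dict[str, Situation] = {
    "town_1": {"mayor": {"m1"}},
    "town_2": {"mayor": {"m2"}},
}

# Binding the situation pronoun: the weak definite is evaluated in each
# quantified situation, so its referent co-varies with the situation.
by_situation = {name: iota(s, "mayor") for name, s in towns.items()}

# Binding the index: one fixed situation, but the index argument of the
# strong definite ranges over the values supplied by a quantifier.
library: Situation = {"book": {"b1", "b2"}}
by_index = [iota(library, "book", index=y) for y in sorted(library["book"])]
```

Neither computation appeals to a separate contextual domain restriction: the situation argument alone narrows the domain in the first case, and the index alone picks out the referent in the second.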


2.5 Some additional theoretical issues 


While the main focus of the remainder of the paper is on cross-linguistic empiri- 
cal issues, there are some further theoretical questions in relation to the analysis 
sketched above that should not go unmentioned (though the discussion below is 
hardly exhaustive in this regard). First, while the denotations in (13b) and (14b) 
are clearly related, and in fact largely overlap, this is not captured in any ex- 
planatory way as things stand - there simply are two lexical entries that happen 
to be very similar. Recent work by Grove & Hanink (2016) and Hanink (2017) 
proposes to address this issue by assuming just one definite article, with a deno- 
tation like the one in (13b), which can be compositionally extended to yield the 
strong article. In other words, the lexical variation above is instead re-analyzed as 
purely structural variation, all couched in a Distributed Morphology account of 
the contraction phenomena. This seems like a very promising avenue, though a 
few new questions also arise in light of it: first, given that this account is directly 
tied into capturing contraction, how can it be extended to languages with two 
full, independent paradigms for weak and strong articles (such as Fering)? Relat- 
edly, how does this approach integrate languages where the correlate of weak 
article definites seems to be expressed by bare nouns? Finally, some potential 
evidence in favor of multiple lexical entries for different definite articles comes 
from Grubic (2016), who presents data suggesting a separate relational strong 
article variant being in play in bridging cases. Despite these further concerns, it 
is theoretically desirable to tie together the analysis of weak and strong articles 
in a more explanatory way, so reconciling these issues with a more explanatory 
proposal should clearly be pursued in future work. 


Given the existence of so-called donkey anaphora cases with strong article definites, the latter 
furthermore requires some version of dynamic binding. 




Another range of rather intricate issues arises in connection with relative 
clauses. It has commonly been claimed in the literature that restrictive relative 
clauses require the strong article in their head. To the extent that this holds, 
it clearly requires an explanation of the interaction between the structure and 
meaning of the article and a relative clause structure in a position that would 
standardly be assumed to feature as part of its complement NP. But complicating 
things further, various authors have pointed out additional subtleties, potentially 
involving further distinctions between types of relative clauses (see, among oth- 
ers, Cabredo Hofherr 2013; Wiltschko 2013; Simonenko 2014). While the recent 
literature (including a proposal for capturing the - likely too - simple generaliza- 
tion about restrictive relative clauses by Grove & Hanink 2016) has contributed 
real advances, this area will require substantial further attention, especially cross- 
linguistically. 


3 The weak vs. strong contrast across languages 


3.1 Key empirical and theoretical questions 


As we now turn to an overview of data from languages exhibiting similar phe- 
nomena, let us begin by stating the key empirical questions about the cross- 
linguistic data in relation to weak and strong article definites. First, we need 
to determine what other languages exhibit the same (or at least a highly simi- 
lar) contrast in their noun phrase system. Secondly, what formal means do other 
languages utilize in expressing it? Finally, to what extent do we find variation in 
terms of its semantics/pragmatics, and how does this relate to its formal expres- 
sion on the one hand and the noun phrase system of the language in question on 
the other? 

To preview the perspective laid out below, I argue that there is quite a broad set 
of unrelated languages that exhibit contrasts that can arguably be modeled in a se- 
mantically uniform way, suggesting that the underlying contrast between weak 
and strong article definites is generally available as part of the inventory that 
natural languages can draw on. Within those languages, we find a wide range of 
formal means for encoding it. Understanding this variation in form seems cru- 
cial for a satisfactory analysis of the interplay of forms and meanings involved. 
In addition to this first set of languages with an essentially uniform meaning 
contrast, other languages seem to diverge more substantially from this pattern 
in that they display different types of distinctions. One possibility is that these
are simply revealing yet another dimension of possible variation that is in principle
independent of the weak vs. strong contrast. Alternatively, we can consider a
more gradient approach to variation that allows languages to fall into different
places on a continuum of possible differences between types of definites. Ultimately,
the key theoretical questions are how many distinctions are needed to
account for the range of empirical variation, what their nature is (e.g. categorical
or gradient), and, if there are multiple such distinctions, how they are related.
We will naturally not be able to answer all these questions conclusively, but will
discuss pertinent data in relation to these issues.

With regard to variation in form, one way in which languages clearly differ
is in whether they exhibit a contrast between two overt forms, or whether the
contrast is between the presence and the absence of a given form (cf. the distinction
between Type I and Type II splits in Ortmann 2014). The former situation
clearly holds in the Germanic dialects and in Icelandic (Ingason 2016), and possibly
also in Hausa and Lakhota (for discussion and references, see Schwarz 2013).
The latter situation seems to hold in Akan (Arkoh & Matthewson 2013), Korean
(Cho 2016; Ahn 2016), Mauritian Creole (Wespel 2008), Czech (Šimík 2015), Thai
and Mandarin (Jenks 2015), Upper Silesian (Ortmann 2014), Upper Sorbian (Ortmann
2014), Ngamo (Grubic 2016), American Sign Language (Irani & Schwarz
2016) and Lithuanian (Šereikaitė 2016).

The following sections provide illustrative pairs of examples from a fair num- 
ber of these languages, selected to highlight cases where the contrast has been 
studied in some detail. The core phenomenon I focus on is bridging, as this is both 
in many ways the most subtle and perhaps most surprising aspect of the article 
contrast, since the data themselves in no way intuitively impose what analysis 
of definites would be the most obvious candidate. But note that at least generally 
speaking, parallel effects systematically occur for more standard anaphoric and 
unique definite uses in all these cases, so the data discussed here for illustration 
should not be taken to suggest that the relevant distinction is only made for the
bridging cases.


3.2 Illustrations of weak and strong article definites across languages 


The first illustration comes from Akan. Arkoh & Matthewson (2013) discuss data
parallel to that considered in Schwarz (2009), with a contrast between bare noun
phrases, as in (15a), which presumably is a case of bridging involving situational
uniqueness, and the familiar form nó in (15b), which they argue to be a case of
anaphoric bridging.

A caveat before diving into the cross-linguistic data: not all of the languages discussed below
have been investigated at the same level of empirical depth, and there thus may be more variation
than apparent here. But I tried to only include relatively well-documented cases that so
far have essentially yielded complete overlap with the German contrast.


(15) Akan (Arkoh & Matthewson 2013: 14-15) 

a. Weak 
Yè-hú-ù dàn dádáw bí wa ekurasihs ńkyénsidán
1PL.SBJ-see-PAST building old INDEF at village there roof
(#nó / #bi) é-hódwoów
DEF / INDEF PERF-worn-out

‘We saw an old building in the village; (#the / #a (certain)) roof was
worn out.’

b. Strong 
Äsäw nó yé-8 hín nö few ara mà 3-kyé-é 
dance DEF do-PAST chief DEF beautiful just COMP 3SG.SBJ-give-PAST
okyiréfó nó adzi
trainer FAM thing


‘The dance was so beautiful that the chief gave the trainer a gift.’


Similarly, Mauritian Creole, discussed by Wespel (2008), distinguishes between 
a null form (16a) and one clearly derived from the French definite article la, but 
which seems to be restricted to uses parallel to the strong article, as illustrated 
by the anaphoric ‘book-author’ bridging case in (16b). 


(16) Mauritian Creole (Wespel 2008: 155-156; source: O.M.2.8, O.M.22) 

a. Weak 
Mo fin visite enn lavil dan provins. Lameri ti pli ot ki 
I ACC visit one village in province town-hall PST more high than
legliz.
church
‘I visited a village in the province. The town hall was higher than the
church.’

b. Strong
Li fin kontan liv la ek aster li envi zwen loter la.
she PST love book DEF and now she want meet author DEF


‘She was fond of the book and now she wants to meet the author.’


For recent work offering a different perspective, which disagrees with the familiarity-based
analysis by Arkoh & Matthewson (2013), see Bombi-Ferrer (2017).

American Sign Language features an expression resembling pointing within
the signing space, which has been much discussed in the recent literature with
regard to its pronominal uses (Schlenker 2017). However, it also serves the role
of a strong definite article, as illustrated by its obligatory occurrence in anaphoric
bridging in (17b). In contrast, cases involving situational uniqueness bridging,
as in (17a), are incompatible with this form.


(17) American Sign Language (Irani 2016) 
a. Weak 
IX, CAR, POLICE STOPPED WHY (#IX,) MIRROR BROKEN.
‘The car was stopped by the police because the mirror was broken.’
b. Strong
JOHN BUY IX, BOOK. #(IX,) AUTHOR FROM FRANCE.
‘John bought a book. The author is from France.’


In yet another similar vein, recent discussion of Korean suggests that what 
had traditionally been considered a demonstrative - ku - seems to function as 
a familiar definite marker, while uniqueness-based definites are expressed with
bare noun phrases.


(18) Korean (Cho 2016: 6) 

a. Weak 
Gyeolhonski-e gatda. Sinbu-ga /#ku sinbu-ga  paransek-ul 
wedding-to went bride-NOM / that bride-NOM blue-ACC
ipeotda.
wore
‘(I) went to a wedding. The bride / #that bride wore blue.’

b. Strong 
Jonathan-un eojebam-e sesigan dokseorul haetda. ku 
Jonathan-TOP yesterday night-at three hours reading did. ku
soseolchayk-i / #soseolchayk-i jaemi-itdago saengakhaetda.
novel-NOM / novel-NOM interesting thought.


‘Jonathan read for three hours last night. (He) found the novel
interesting.’


Interestingly, this same form can also be used to introduce new discourse referents, as can be
seen in the first sentence of (17b); see Irani (2019) [in this volume] for a fuller analysis.

See Ahn (2017) for a recent proposal that Korean actually makes a three-way split, further
extending the typological picture.

A final case (at least as far as the present discussion is concerned) of a language
that has been argued to feature an overt form, namely a specific classifier construction,
that parallels strong article definites, vs. bare nouns to express weak
article definites, is that of Thai.


(19) Thai (Jenks 2015: 109) 

a. Weak 
rot khan nan thüuk tamruatsakat phrö? ` máj.dáj tit 
car CLF that ADV.PASS police intercept because NEG attach
satikaa wáj thii thabian (#baj nan).
sticker keep at license CLF that
‘That car was stopped by police because there was no sticker on the
license.’

b. Strong 
fool khít waa kloon bot nan prs? maak, mée-wáa kháw cà 
Paul thinks COMP poem CLF that melodious very, although 3SG IRR
maj chóop naktéenkloo #(khon nan).
NEG like poet CLF that


‘Paul thinks that poem is beautiful, though he doesn’t really like the
poet.’


A rather different instantiation of the weak vs. strong article contrast can be 
found in Icelandic. While the definite article generally appears as a suffix on the 
head noun, this suffixation is blocked by a certain class of evaluative adjectives. 
Ingason (2016) shows that the free form hinum, which had previously been considered
archaic, can occur in such cases in the modern standard, but only if
we are dealing with a weak article definite. Strong article definites in such circumstances
can only be expressed by the demonstrative þessum.


(20) Icelandic (Ingason 2016: 108, 131) 
a. Weak 


Context: The speaker is annoyed that she always loses. There is only 
one winner per round. 
Alltaf eftir hverja umferð eru spilin gefin aftur af 
always after each round are cards.the given again by 
[DP hinum óþolandi sigurvegara].
HI-the_weak intolerable_evaluative winner


‘Always after each round, the cards are dealt again by the intolerable
winner.’
b. Strong
Previous discourse: Mary talked to a writer and a terrible politician.
She got no interesting answers from...
...þessum / #hinum hræðilega stjórnmálamanni.
this / HI-the_weak terrible_evaluative politician


Another case where adjectives crucially feature in the expression of the weak 
vs. strong contrast, though in a different way, is Lithuanian (Šereikaitė 2019 [in 
this volume]). It exhibits a definite suffix that appears on adjectives, but only 
when they are of the strong article definite variety. In cases of uniqueness-based 
definites, the adjective will form a noun phrase with the noun without this suf- 
fix. Interestingly, such "bare" forms also have indefinite uses. Furthermore, the 
suffix has a much wider distribution, and can also appear on demonstratives and 
pronouns, among others. This wider distribution, as well as more intricate varia- 
tions in the range of uses involving kind reference, deserve much more detailed 
attention, but at this point it seems safe to say that at least part of the contrast be- 
tween bare and definite-suffixed forms seems to track the weak vs. strong article 
definite contrast. 


(21) Lithuanian (Šereikaitė 2019 [in this volume])
a. Weak 

Praėjus dviem savaitém po rinkimų, prezidentas turi teisę atleisti 
Passed two weeks after elections president has right fire 
naują / #naują-jį ministrą pirmininką tik išskirtiniais atvejais.
new / new-DEF minister prime only exceptional cases
‘Two weeks after the election, the president has a right to fire the
new prime minister only in exceptional cases.’


b. Strong 
Knyga “Lietus” sulaukė nejtikétino populiarumo, nepaisant to, kad 
Book ‘Rain’ received incredible popularity despite that 


talentingas-is / *talentingas rašytojas nusprendė likti
talented-DEF_strong / talented_weak writer decided remain
anonimas.

anonymous

‘The book ‘Rain’ became incredibly popular despite the fact that the
talented writer decided to remain anonymous.’


While this overview can only be cursory, given space constraints, the relatively
minimal pairs of examples from this range of largely unrelated languages
should illustrate that key phenomena concerning the weak vs. strong article definite
contrast are mirrored by formal distinctions between different types of definite
noun phrases cross-linguistically. There are two key questions, both from
a theoretical perspective and for pursuit in future research on definites across
languages: a) how does the formal expression of the contrast vary across languages
and how does this variation relate to the core meaning contrast? b) to
what extent is the contrast the same across languages, and to what extent, and
in what form, do we find variation in this regard? I turn to some necessarily
preliminary considerations in the following section.


4 Variation in form and meaning 


4.1 Variation in form 


Starting with variation in how the contrast between weak and strong
article definites is formally expressed, an initial generalization, from the perspective of
the analysis of Schwarz (2009), seems to be that a ‘more’ in meaning is generally
reflected in a ‘more’ in form: the weak article definites in German and related
dialects all involve morpho-phonologically reduced forms, e.g. contraction
in Standard German. In the Germanic dialects with two full article paradigms,
weak article forms also seem to be less complex than strong article ones. And in
many languages, of course, this situation descriptively holds in the extreme, as
weak article definites are expressed with bare noun phrases.

Two particularly interesting cases with regard to the formal realization of
the contrast are Icelandic and Lithuanian. In Icelandic, the same nominal suffix 
is used to express both types of definites in most contexts. Only when, in the anal- 
ysis of Ingason (2016), suffixation is blocked by evaluative adjectives do we find 
a distinction, such that an otherwise archaic free-form article is used for weak ar- 
ticle definites. While at first sight, this seems perhaps at least in one sense more 
complex than the default configuration, strong article definites cannot be real- 
ized by the default form in that case either, but instead call for a demonstrative 
(which is more complex). 

Turning to Lithuanian, perhaps the most notable point is that the explicit indication
of definiteness occurs neither on the noun itself nor at the level of a (potential)
D-head, but rather in the form of a suffix on adjectives between these two.
The formal relation between this suffix and a potential null D-head of course
constitutes one key question in this regard, and there seem to be arguments in
favor of a DP-layer for both cases, contrary to what has been said about, e.g.,
Serbo-Croatian, where the formal realization otherwise seems somewhat similar
(Šereikaitė 2016). In addition, it bears repeating that the same suffixal form that
we find on adjectives can also appear in various other places, most relevantly
pronouns and demonstratives. While in principle, the effect there does not seem
to be dissimilar, the details are not obvious and require much more extensive
exploration.

Returning to the more general issue of meaning and form, the apparent gener- 
alization about the formal realization of the distinction should be taken seriously 
and relates to key choice points in the semantic analysis of the article contrast: 
if we want to capture the relationship between both the forms and meanings in- 
volved in such a way that one is in some way derived from, or an extension of, the 
other, then this would call for broader proposals of the sort put forth by Grove & 
Hanink (2016) and Hanink (2017), briefly discussed above, which extend to cases 
of languages with two full article paradigms. On the other hand, if we assume two
distinct lexical entries for weak and strong articles, then the generalization about
the forms involved would have to be explained in another way, e.g. from the per- 
spective of historical development, which could see the morpho-phonologically 
less complex forms as more grammaticalized or bleached, perhaps in parallel to 
the relation between demonstratives and definite articles more generally (Lyons 
1999). 

The fact that many languages use bare noun phrases for the weak article also 
relates to this question, of course, as well as to key issues in DP-syntax. In partic- 
ular, the question arises of whether or not a determiner-level is present in these 
noun phrases in the first place, and if so, why it is the weak article meaning 
that can standardly be realized as phonologically null. Alternatively, a common 
move is to assume that purely semantic type-shifters can do the job of (both defi- 
nite and indefinite) articles when overt forms are lacking (Partee 1986; Chierchia 
1998; Dayal 2004). This then raises questions about the interplay between the 
determiner-inventory in the relevant languages and the constraints for the ap- 
plications of such type-shifters. Furthermore, since the null-hypothesis for such 
type-shifters clearly would be that their effect is universal across languages, any 
variation in the interpretive options of bare noun phrases that cannot be ac- 
counted for in terms of the determiner system of the language in question, e.g. 
in terms of blocking effects from available overt forms, would seem to support 
the notion that distinct lexical determiners with the same phonologically null 
form can in principle be available, in contrast to what is commonly argued by 
proposals based on type-shifters (for recent discussion, see Dayal 2016).

Of particular importance in this regard is the potential case of languages which
exhibit a genuine ambiguity between definite and indefinite interpretations for
bare noun phrases. Initial evidence in relevant discussion of, e.g., Akan (Arkoh &
Matthewson 2013), Lithuanian (Šereikaitė 2019 [in this volume]), and ASL (Koulidobrova
2012; Irani 2019 [in this volume]) suggests that this is a possibility, contra
the type-shifter based proposal by Dayal (2016), but further scrutiny is needed,
both empirically and in terms of integrating the article-contrast issues into the
broader theoretical picture.


4.2 Variation in meaning 


While in the data so far the semantic contrast arguably can be seen as entirely 
uniform, it is undeniable that there is some degree of variation in this regard as 
well. Some of it consists of fairly detailed aspects, including what forms are used 
in certain cases where the contextual constraints for anaphoric uses or situational 
uniqueness are met, and in some cases additional distinctions involving other 
features may be at play as well. Generally speaking, these cases are consistent 
with the semantic analysis of the contrast laid out above, but involve differences 
in what form winds up being preferred given a certain type of context. But there 
also seems to be more substantial variation, which may require reconsidering 
the broader theoretical set of options. Some illustrations of the former cases are 
provided in the remainder of this section, while I turn to the latter in the next 
section. 

One point of more subtle variation concerns anaphoric usage in longer narra- 
tive texts. A central character of a story (e.g. a fisherman, as in the Fering story 
considered by Ebert 1971b) may be introduced with an indefinite, and then ini- 
tially picked back up by a strong article definite. But as the central role of the 
character becomes clear in the narrative, one may then switch to using weak 
article definites for it. In contrast, according to intuitions reported by Anton In- 
gason (p.c.), Icelandic would keep using the form corresponding to the strong 
article definite in this situation. But while the conditions for anaphoric uses are 
met, the central role of the character in question may also suffice to provide con- 
textual restriction to ensure uniqueness of that entity. 

Another point of variation concerns contexts involving entities which are both
unique and familiar (at least in a weak sense) in the broader non-linguistic context,
e.g. with regard to a family dog. Akan and German seem to differ here, in
that the former chooses to use the overt strong article, whereas German prefers
the weak article form.

One important question in this discussion is what counts as an "article-less" language for the
purposes of generalizations made by such proposals: where do languages which express weak
article definites with bare noun phrases, but have an explicit determiner form for strong article
definites, fall?


(22) Context: You and your spouse own one dog. While your spouse is away, 
someone breaks into your house and you are telling them about it on the 
phone. You say: 


a. German (Arkoh & Matthewson 2013: 19) 
Der Einbrecher ist zum Glück vom / #von dem Hund 
the burglar is to-the_weak luck by-the_weak / by the_strong dog
verjagt worden.
chased been


‘Luckily, the burglar was chased away by the dog.’


b. Akan (Arkoh & Matthewson 2013: 19) 
Owifó nó, bàdóm nó  kà-á nó-dó árá má 
thief DEF dog DEF follow-PAST 3SG-OBJ-on just so
0-güán-ii.
3SG.SBJ-run-PAST
‘The thief, the dog chased away.’


But as before, the fact that conditions for situational uniqueness are met and 
an anaphoric form is used is not incompatible with the formal analysis. All that
is required for a strong article definite is that its index receives a value from 
the assignment function. When an entity such as a family dog is familiar in a 
context, that may suffice to establish that, parallel to how personal pronouns 
can be used in similar situations, e.g. by parents who have a single boy who 
can be referred to as he without any recent prior mention. But nonetheless, the 
question, of course, needs to be addressed just why a language like Akan should 
differ precisely in that regard from other languages. One possibility is that the 
availability of indefinite uses of plays a role here; this will need to be tested with 
regards to other languages with similar properties. 

Contexts of situational uniqueness bridging also seem to exhibit some varia- 
tion. For example, Wespel (2008) cites Amern data from Heinrichs (1954), show- 
ing that the strong article is used in the following example for the noun phrase 
headed by altars, even though it is clearly part of the aforementioned church. 


Mauritian Creole may be similar to Akan in this regard; see Wespel (2008: 189-190).

(23) Amern (Heinrichs 1954: 99)


Vör woran en da ndldarkerak on wolan os dns di altóórs 
we were in DEF o-N church and wanted us once DEF.PL strong altars 
bekika. 

look-at 


‘We were in the church of Waldniel and wanted to have a look at the
altars.’


The extent to which this is compatible with the formal analysis at least in part 
depends on the properties of the nouns in question, in particular with regards 
to the possibility of them receiving a relational meaning, as relational nouns 
in principle will open up to anaphoric bridging with the strong article, parallel 
to the book-author cases considered above. Interestingly, other languages have 
been argued to exhibit inter-speaker variation precisely in this regard: Ortmann 
(2014) reports data from Upper Sorbian, which seems to at least in part reflect 
generational variation such that, for some speakers, the strong article ton is not 
obligatory in cases like the following, while it is obligatory across the board in 
cases parallel to the book-author examples. Additionally, Ortmann reports par- 
allel judgment patterns in Upper Silesian to be extremely hard to ascertain em- 
pirically. 

Yet another dimension of potential minor variation involves additional distinc- 
tions. In particular, Ahn (2016) reports a 3-way split in Korean, with an additional 
form specialized for genuinely deictic uses (which are commonly available for 
strong article forms in other languages as well). 

In sum, there is clear evidence of what can be considered fairly minor variation 
in the article contrast across languages, which in principle is consistent with 
the semantic characterization provided, but calls for further explanation of why 
languages should make different pragmatic choices about which article to use in 
a given type of context. Additionally, further and more fine-grained distinctions 
extending beyond the weak-strong contrast seem to exist as well. While much 
more needs to be explored, this data at least in principle seems to be amenable 
to explanation within the general approach outlined above.


5 Beyond weak vs. strong


5.1 Different semantic contrasts 


In addition to what we saw in the previous section, there are other languages 
that seem to diverge in more substantial ways in the way that they exhibit a 
contrast between different types of definite articles. For example, while Haitian 
Creole is superficially similar to Mauritian Creole, and both have French as their 
main source language, the contrast between definite noun phrases marked with 
la (derived from the French definite article, as in Mauritian Creole) and bare ones 
seems different from what we have seen before. First, parallel to the Amern
data above, there seems to be no contrast between different types of bridging, 
and both situational and anaphoric bridging use the overt form (here realized as 
la or a): 


(24) Haitian Creole (Wespel 2008: 114; source: E.F.32, E.F.36.9) 


a. Weak article definite context 
Ye, mwen viste yon vil provens. Meri a pi wo 
yesterday I visit one town province town-hall DEF more high 
ke  legliz la. 
than church DEF


‘Yesterday I visited a town in the province. The town hall was higher
than the church.’

b. Strong article definite context 
Eli te renmen liv la, e kounye a li we rankontre otè
Eli PST love book DEF and now DEF she want meet author
a.
DEF


‘Eli loved the book, and now she wants to meet the author.’


Similarly, larger or immediate situation uses (in the terminology of Hawkins 
1978), which in other languages call for the weak article or equivalent, also gen- 
erally call for the overt form. The bare form is only used for what Wespel calls 
complete functional descriptions, i.e. cases where the head noun denotes a func- 
tion and its relatum argument is explicitly introduced, as in (25), which, as Wespel 
spells out in some detail, does not involve a possessive construction of any sort. 


Potential other candidate languages fitting this category include Bangla (Simpson & Biswas
2016) and Jinyun (Simpson 2017), though further research is needed to compare these various
cases in more detail.

(25) Haitian Creole (Wespel 2008: 98)
papa Mari
‘the father of Mary’


This situation seems very much at odds with the weak vs. strong article con- 
trast as spelled out above. To begin with, global uniques (such as the sun) are core 
cases for the analysis in Schwarz (2009). The split between these and "complete 
functional descriptions" is also rather puzzling from that perspective. One sen- 
sible reaction might be to take this to reflect a fundamentally different contrast, 
and I will explore some potential avenues for such a move below. But even if this 
were successful, it would leave us with vexing questions about how this state of 
affairs came about, especially given the fairly minimal pair of two French-based 
creoles that both retain a form based on French la, but use it in apparently very 
different ways.

Turning to potential directions for alternative characterizations of the Haitian 
Creole contrast, some rather suggestive examples are discussed by Wespel (2008). 
In particular, the presence or absence of la seems to relate to the introduction 
of the domain of only (and parallel effects exist for superlatives). Specifically,
when the domain of only is explicitly restricted by a post-nominal prepositional 
phrase, such as ‘in his family’, then no la (or allomorph) appears on the noun 
phrase associated with only (26a). In contrast, when this prepositional phrase is 
used as a framing adverbial, and not in the scope of only, then the overt article 
form does appear (26b). 


(26) Haitian Creole (Wespel 2008: 118-119; source: E.F.76.20.a, E.F.76.20.b)
a. Pyè se  sèl  gason nan fanmi  li.
   P   COP only boy   in  family his
   ‘Peter is the only boy in his family.’
b. Fanmi  sa  a,  se  yon  gwo fami,  men Pyè se  sèl  gason an.
   family DEM DEF COP INDF big family but P   COP only boy   DEF
   ‘This family is big, but Peter is the only boy.’

Given this suggestive data, one potential avenue to explore, building on the
proposal by Wespel (2008) that la indicates the use of a "resource situation
variable", is that it is the overt realization of a situation pronoun in the sense of


18 Another interesting potential consequence of such a move, which I am not able to explore
here in detail, is that this would seem like another case of genuine variation in the type of
definiteness involved with bare noun phrases, which would be somewhat surprising for
type-shifting based accounts of such noun phrases, again under the assumption that what
type-shifters can do is universal.
Percus (2000). Formally, a candidate requirement introduced by this particular
type of situation pronoun could be that it is not identical to the topic situation
relative to which its clause is evaluated.19 The idea would then be that (certain)
overt phrases, such as the prepositional phrase ‘in his family’ in (26a) as well
as relatum DPs in functional descriptions such as (25), are an alternative way of 
specifying the value of this situation variable, making the overt article form un- 
necessary. Interestingly, there also seems to be some variation in the presence of 
the overt form corresponding to the difference between situational uniqueness 
through common knowledge vs. anaphoricity (27); however, much more work is 
needed to flesh out the full empirical picture here. 


(27) Haitian Creole (Valdman 1977: 116) 


a. Kote manje mwen? 
where meal my (interpreted relative to topic situation?) 


b. Kote manje mwen an?
where meal my DEF (based on previous mention)


‘Where is my meal?’


Theoretically, this type of approach has further implications as well.
For example, global uniques would have to be assumed to require a situation
pronoun (with a value distinct from the topic situation). Potentially
interesting predictions arise with regard to intensional contexts, where situation
pronouns fill the additional role of determining the intensional status of a given
noun phrase (e.g. in terms of the de re/de dicto contrast). In this regard, the fact
that la can occur on entire clauses as well would also be of further interest. And
as already mentioned, the relationship between what happened to French-based 
la over time in Haitian and Mauritian Creole seems like a rich and important 
issue to explore. From the perspective just sketched, we might be dealing with 
a situation where the two take rather different paths to superficially similar but 
underlyingly distinct systems, roughly corresponding to the difference between 
representing anaphoric individual variables (as part of the strong article mean- 
ing) and representing variables for situations in the form of situation pronouns. 

In sum, the case of Haitian Creole, which likely is mirrored in other languages 
as well, goes beyond what might be characterized as mere pragmatic variations 
in how the same meanings are put to use in the system of a given language, as re- 
flected, e.g. in the lack of a bridging contrast in languages like Amern. A striking 


19 Note that the analysis of English demonstratives by Wolter (2006) develops some strikingly
similar ideas for a different set of empirical facts. 
observation, from the present perspective, is that even global uniques come with
the overt form. The main question moving forward then will be to what extent 
the pattern represented here by Haitian Creole might reflect a fundamentally 
different type of contrast, or whether there are other languages that could be 
seen as further in-between cases, with a mix of the properties of the languages 
discussed in previous sections and cases like Haitian Creole. If the latter were 
the case, this might suggest that we are dealing with a more gradient spectrum 
after all, which would require some fairly substantial reconsiderations for an ap- 
proach based on the formal article contrast as laid out above. I briefly review and 
comment on such a more gradient account in the following section. 


5.2 Semantic vs. pragmatic uniqueness 


A prominent alternative analysis goes back to Löbner (1985), with more recent
developments in Löbner (2011) and, of particular relevance for our purposes, a 
fairly extensive typological discussion in Ortmann (2014). The core idea rests on 
a distinction between semantic and pragmatic uniqueness, which crucially rides 
on whether context has any role in establishing uniqueness. More specifically, 
semantic uniqueness holds if a definite description refers unambiguously based 
on the meaning of the noun alone, in a context-independent manner. In contrast, 
in cases of pragmatic uniqueness, reference is unambiguous only under consider- 
ation of contextual information, which can be linguistic or extra-linguistic. Cru- 
cially, this distinction is seen relative to a gradient uniqueness scale, which al- 
lows different languages to choose different cut-off points for using one form as 
opposed to another. Ortmann (2014) succinctly states the role of these notions 
for article contrasts (or "splits"): 


[...] the distinction between semantic and pragmatic uniqueness is the basis
of all conceptually governed article splits, in that a shift towards an IC
[Individual Concept] or FC [Functional Concept] is overtly signaled.
(Ortmann 2014: 296)


The approach crucially rests on the assumption that nouns differ lexically from
one another with regard to their semantic types. Table 3 provides an overview
of the key dimensions of variation, namely a) whether their meanings are at their
core referential (ending in type e) or predicative (functions from a given number
of individuals to truth values), and b) whether they are monadic or polyadic.

However, the type of nouns can be adjusted through (fairly standard) type- 
shifting operations. Definite noun phrases are generally analyzed as functional 
concepts, in that they are assumed to refer unambiguously. However, that status 
is attained in different ways, in that some nouns require a type-shifter, and oth- 
ers do not. The difference between two distinct definite articles is then captured 
Table 3: Semantic vs. pragmatic uniqueness (adapted from Ortmann 2014)

                          Monadic                Polyadic
Non-unique (pragmatic)    Sortal nouns           Relational nouns
                          dog, stone             sister, finger
                          ⟨e, t⟩                 ⟨e, ⟨e, t⟩⟩
Unique (semantic)         Individual nouns       Functional nouns
                          sun, prime minister    father, head
                          ⟨e⟩                    ⟨e, e⟩


in terms of the signal they convey about how uniqueness was achieved. For ex- 
ample, the idea for Standard German would be that the strong article indicates 
pragmatic uniqueness, whereas the weak article indicates semantic uniqueness. 

This idea is made more flexible by the notion that different types of noun 
phrases relate to the context in different ways. Based on this, the approach as- 
sumes a scale of uniqueness, “defined according to the degree of invariance of 
reference of nominal expressions” (Ortmann 2014): 


(28) Scale of uniqueness (Ortmann 2014: 314; adapted from Löbner 2011) 
deictic sortal noun < anaphoric sortal noun < SN with establishing relative 
clause < relational Definite Associative Anaphora < part-whole Definite 
Associative Anaphora, non-lexical functional nouns < lexical individual
nouns/functional nouns < proper names < personal pronouns 


Essentially, a language with a contrast between definite articles could then 
draw the line anywhere on this scale, marking expressions to one side with a 
weak article and those to the other side with the strong article. Intuitively, the 
idea is that different nouns require different amounts of lifting to end up with the 
right semantic type for a definite description, and the articles serve as indicators 
of whether a certain amount of lifting had to occur. The approach naturally af- 
fords a substantially more fine-grained set of typological options than any simple 
binary contrast. 
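
The cut-off idea can be made concrete with a minimal sketch. It assumes that the scale in (28) can be treated as a simple ordered list and that the strong article marks the lower (more context-dependent) end, in line with the Standard German illustration above. The names SCALE, article_for, and the integer cutoff are hypothetical and serve illustration only; they are not part of the proposal in Ortmann (2014), and the sketch ignores that the highest positions often take no article at all.

```python
# Hypothetical encoding of the scale in (28), lowest invariance of reference first.
SCALE = [
    "deictic sortal noun",
    "anaphoric sortal noun",
    "sortal noun with establishing relative clause",
    "relational definite associative anaphora",
    "part-whole definite associative anaphora / non-lexical functional noun",
    "lexical individual noun / functional noun",
    "proper name",
    "personal pronoun",
]


def article_for(expression_type: str, cutoff: int) -> str:
    """Return 'strong' for types below the cut-off index and 'weak' otherwise.

    Assumption: the strong article signals pragmatic uniqueness (low end of
    the scale) and the weak article signals semantic uniqueness (high end).
    """
    position = SCALE.index(expression_type)
    return "strong" if position < cutoff else "weak"


def article_split(cutoff: int) -> dict:
    """Map every expression type on the scale to an article for a given cut-off."""
    return {expression_type: article_for(expression_type, cutoff)
            for expression_type in SCALE}


if __name__ == "__main__":
    # Two imaginary languages that draw the line at different points.
    for cutoff in (2, 4):
        print(f"cut-off {cutoff}:")
        for expression_type, article in article_split(cutoff).items():
            print(f"  {expression_type}: {article}")
```

Since any integer from 0 to len(SCALE) is a possible value for cutoff, the sketch also makes visible how many distinct splits the scale allows in principle.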

While not all relevant aspects of this proposal can be discussed here, let us 
briefly assess both challenges and strengths of this general approach. 

Starting with the former, there is a question at the level of the general ar- 
chitecture of the syntax-semantics interface with regards to the mapping from 
syntactic categories to semantic types. While it is clear that we have to allow 
for some flexibility, e.g. with regard to the number of arguments a given
predicate involves, sub-dividing the space of lexical entries for nouns into predicates
and entities gives rise to additional complications. These are by no means
insurmountable, but their repercussions have to be assessed carefully. On the flip side,
the availability of the type-shifters that are standardly invoked for dealing with
these complications has to be carefully constrained. Another aspect that requires
further spelling out is the nature of the measure on the
uniqueness scale, especially as new potential contrasts are considered based on 
new data from additional languages. On the semantic side, the question arises of 
how cases where there is a clear overall meaning contrast based on which article 
is used are captured in the formal derivation if the articles themselves do not 
contribute any meaning. Finally, the specification of the key notions of unique- 
ness tries to characterize unambiguous reference relative to the denotation of the 
noun (since it is based on lexical properties), rather than the full noun phrase. But 
this does not translate straightforwardly to cases of more complex noun phrases, 
where traditional uniqueness-based analyses crucially rely on the compositional 
combination of the determiner with its complement noun phrase as a whole (e.g. 
including modifying adjectives). Relatedly, it is not obvious how the broader in- 
tegration of this approach into a formal semantic system that interacts with the 
grammar should proceed, specifically with regards to the various mechanisms 
for co-variation under quantifying expressions briefly discussed above. 

There are empirical problems for this type of approach as well. In particular, 
sortal nouns of various kinds can be turned functional through appropriate con- 
texts — as illustrated by the following variation on (7b) (where a strong article 
was required): 


(29) German 

Context: Hans, who works at a ministry, and his wife are talking about
what has been going on at work.

a. What happened to the proposal you drafted?

b. Der Vorschlag wurde in der Kabinettssitzung gestern   vom
   the proposal  was   in the cabinet meeting  yesterday by.the_weak
   Minister vorgestellt, aber 7 SPD-Minister  haben dagegen gestimmt.
   minister introduced   but  7 SPD-ministers have  against voted
   ‘The proposal was introduced by the minister in yesterday's cabinet
   meeting, but 7 SPD-ministers voted against it.’


Crucially, nothing about the noun in such cases ensures uniqueness directly, 
and to the extent that uniqueness does hold, that only is so based on a substantial 
amount of contextual information - in essence, the entire definite noun phrase is
interpreted relative to the speaker's work place here. But surely such a contextual 
modulation should not lead us to consider different lexical entries for the word 
‘minister’.20

Let us now turn to some of the strengths of this proposal. First, as already 
noted above, it allows for a substantial range of variation between languages 
along a single dimension, and Ortmann (2014) applies the resulting prediction 
in interesting ways, both synchronically and diachronically. But even as that 
success should be registered, it is worth noting that the formal proposal on its 
own predicts that languages should be able to choose a cut-off point anywhere 
on the scale. In light of the variation present in existing data, it seems that even 
though some flexibility is needed, the full range of options goes beyond what is 
required (of course this could change with additional data being brought under 
consideration). 

In relation to these concerns, it is also worth revisiting some aspects of Haitian 
Creole in light of the analysis in terms of semantic vs. pragmatic uniqueness. The 
uniqueness scale has global uniques on par with functional nouns with explicit 
arguments. But Haitian Creole crucially draws a line between these two, and any 
plausible additional split of the uniqueness scale would predict an opposite order- 
ing from what is empirically attested in this regard. Furthermore, the intriguing 
interaction of la with the domain of only would not seem to be something that 
can be explained in any straightforward way from this perspective. 

In sum, accounts based on the distinction between semantic and pragmatic 
uniqueness do have some desirable empirical predictions going for them, but they 
also face some challenges, both conceptually and theoretically. In light of this, it 
should be clear that accounting for the full range of article variation across lan- 
guages requires substantially more work, regardless of the theoretical approach 
one starts out with. But the empirical picture overall is not incompatible with a 
view where the core weak vs. strong contrast is mirrored in properties of arti- 
cle contrasts across many languages, but various other, potentially independent, 
factors can affect just what form is thought to be ideally suited for the purposes 
at hand. 


20 Note also that this is clearly a different contrast than that in the sketch of Haitian Creole above,
where resource situation would require a strong article. 
6 Conclusion


In this chapter, I have reviewed the key tenets of the contrast between weak 
and strong article definites presented in Schwarz (2009), and considered a range 
of data across various languages in light of it. There seems to be a substantial 
number of languages from entirely unrelated language families that use differ- 
ent forms for different types of definite noun phrases in a way that seems to 
reflect the weak vs. strong article contrast found in Germanic. While there are 
some minor variations in the pragmatics of which forms get used when both are 
available, the nature of the semantic contrast in a large set of languages seems to 
be fairly uniform and consistent with an analysis in terms of situational unique- 
ness and anaphoricity. In addition, the formal realization of the contrasts was 
considered, and there is at least preliminary evidence from the languages dis- 
cussed that there is real variation in the interpretation of bare noun phrases, in 
a way that suggests that distinct null D-heads may be at play in at least some of 
them. 

Additional languages enriched the picture further, as they exhibit contrasts 
that clearly seem to go beyond the weak vs. strong contrast. There are two pos- 
sible approaches to tackling this. First, one can see these languages in terms of 
orthogonal factors, providing insights into potentially related, but ultimately sep- 
arate dimensions of variation. Alternatively, one can see them in terms of a more 
gradient perspective on how different types of definites are signaled within a 
grammar, as on the approach based on semantic vs. pragmatic uniqueness. Both 
types of approaches require extensions and elaborations, so more work is needed 
both empirically and theoretically to achieve a more conclusive assessment of the 
semantic typology of definiteness across languages. However, the sharpening of 
key descriptive notions and crucial contrasts goes a long way towards having 
more precise tools that can help to get a more uniform and broad cross-linguistic 
perspective on the nature and extent of variation. 
Chapter 2


Definiteness in Cuevas Mixtec 


Carlos Cisneros 
University of Chicago 


Languages vary widely in their morpho-syntactic strategies for marking definite-
ness within the noun phrase. Schwarz (2009; 2013) and Jenks (2015) find that these 
strategies often correspond to distinct characterizations of the semantics of definite 
descriptions. Many languages feature distinct mechanisms for expressing definite 
descriptions as either unique or familiar, such as by having two distinct classes of 
definite article or by contrasting definite bare nominals with some form of overt 
definite marking. Cuevas Mixtec shows that a language can also feature internal 
variation in the marking of either uniqueness or familiarity. Most nominals of this 
language are capable of taking on bare forms for the expression of uniqueness, 
while familiarity is expressed using overt definite articles. There are some nom- 
inals, however, which never combine with overt definite articles or which must 
take on definite articles in a larger set of semantic environments. The variation 
observed here seems to be tied to etymological factors within the nominal and the 
influence of an animacy hierarchy. 


1 Introduction 


Recent literature on the proper characterization of definiteness shows that lan- 
guages vary widely in the strategies they utilize for its expression, from bare 
nominals to the occurrence of definite articles or even demonstratives. Schwarz 
(2009) and Jenks (2015) show that when languages feature more than one strat- 
egy for the expression of definiteness, the variation exhibited semantically corre- 
sponds to distinct notions of definiteness itself. One class of definite expressions 
will encode uniqueness of an individual, such that the descriptive content con- 
veyed by the nominal can only be attributed to that individual. Another class 
of definiteness expressions will encode anaphoricity or familiarity, where the 
expression invokes an anaphoric link to a previously mentioned individual in 
a discourse. Both Schwarz and Jenks find robust cross-linguistic evidence for
the validity of a non-uniform approach to the characterization of definiteness, 
given the great diversity of languages that grammaticize the distinction between 
uniqueness and familiarity. However, this is far from being the whole story on 
the nature of definiteness encoding in the nominal domain. Despite growing
research on cross-linguistic variation in the expression of uniqueness and
familiarity, there does not yet seem to be any thorough investigation of
language-internal variation in the expression of the distinction. This paper brings to light
some pertinent details regarding a language with such internal variation, with 
hopes of contributing to the greater account of definiteness across languages. 

Cuevas Mixtec is an Otomanguean language which displays at least two dis- 
tinct strategies for expressing definiteness. The language features a set of definite 
articles that are derived from a noun classifier system. Definiteness may also 
be expressed by bare nouns, which may also have an existentially quantified or 
generic interpretation in some contexts. The example below demonstrates both 
strategies at work, where a nominal isü ‘deer’ is interpreted as a definite descrip- 
tion, referring to the entirety of the group of organisms that are named such. The 
occurrence of the definite article tyí generally restricts the interpretation of isü
‘deer’ to a definite description, but it is optional in this context so long as another 
determiner type does not replace it. 


(1) indyi’t syàdà [(tyí) isü]
end.COMPL account the.AML deer


‘The deer went extinct.’


The examples below show the optionality of the definite article tyà for the ex- 
pression of definiteness on a nominal predicate. Within the village of San Miguel 
Cuevas and surrounding villages, certain festivals are organized by gender-based 
committees led by an administrator of the same gender. There is therefore a 
unique male and unique female administrator for the organization of these fes- 
tivals. In the examples, a character named Juan is being presented as an
administrator.1 The absence of a definite article allows the nominal predicate to be
interpreted as definite when uttered within the context of the male village festi- 
val committee (or indefinite otherwise). The presence of the article restricts the 
interpretation of the nominal predicate to a definite one, identifying Juan as the 
unique male administrator regardless of context. 


1 There are two words for this occupation in Cuevas Mixtec, mastoni and märtöon, both of which
seem to have originated as loanwords from other language groups. Both words appear in this
paper.
(2) Context: Juan is presented as the male administrator within a meeting of
men organizing a village festival.
a. [tyà Juaan] Fon mastóní
the.SG.M Juan be.IPFV administrator
‘Juan is the administrator.’
b. [tyà juáán] káü [tyà mästöni]
the.SG.M Juan be.IPFV the.SG.M administrator


‘Juan is the (male) administrator.’


Recent investigations into languages which feature multiple strategies for def- 
initeness expression have shown that the opposition between the styles of defi- 
niteness marking corresponds to a distinction in the notions of definiteness that 
are being invoked. Schwarz (2009) shows that for the particular case of German, 
weak definite articles encode uniqueness, or the quality of uniquely satisfying the 
descriptive content of the nominal relative to a situation, while strong definite 
articles encode an anaphoric link to a previously mentioned individual. Schwarz 
(2013) and Jenks (2015) later show that when a language allows bare nominals 
to serve as definite descriptions, the notion of definiteness expressed similarly 
tends to be that of uniqueness, while the same language utilizes overt definite- 
ness marking for creating anaphoricity. Cuevas Mixtec is shown in this paper to 
be very similar to other languages which allow for bare nominals to have definite 
interpretations, thereby supporting these previous findings. However, the lan- 
guage presents a more complicated picture by displaying internal variation in the 
correspondences between definiteness marking and the notion of definiteness in- 
volved. There seems to be a grouping of nominals into at least three types with 
respect to the strategy for encoding either uniqueness or anaphoricity. There are 
those which follow the pattern of reserving bare nominals for expressing unique- 
ness and utilizing overt marking for familiarity, those which follow an English 
pattern of utilizing overt marking for both uniqueness and familiarity, and those 
which cannot host definite articles at all. 

In the rest of this paper, I cover the necessary background on the study of both
definiteness and Mixtec to introduce the evidence for the claims made above. In
§2, I briefly introduce the analysis of definiteness marking for languages
which permit bare nominal definite descriptions by Schwarz and Jenks. I provide
Schwarz's examples from Standard German used to demonstrate grammatical
sensitivity to the expression of uniqueness and familiarity. In §3, I then
introduce some background information on Cuevas Mixtec, which will be necessary
for reading the data. I provide a very brief typological introduction to the
language as well as some background on the speaker community located in western
Oaxaca, Mexico, and in California. This is followed by an introduction to
the particular orthography of Cuevas Mixtec that is in development, then by
a grammatical sketch of the language covering basic word order and the basic
grammar of noun classifiers. In §4, I then present evidence for the
interpretation of the definite descriptions of the language as either encoding uniqueness
or anaphoricity. The evidence comes from semantic environments where the interpretation of
a definite description is restricted to either a uniqueness definite or anaphoric
definite, which mutually exclude each other. In §5, the evidence for the
correspondence between definiteness marking strategy and notion of definiteness is
used again to present evidence for internal variation with respect to that
correspondence. Different nominals are compared with each other to establish their
definiteness marking preference in the relevant semantic environments. The
paper then concludes with a summary of the findings.


2 Definiteness background 


This section introduces the key notions of definiteness that will be shown to char- 
acterize the definite descriptions of Cuevas Mixtec. Both notions correspond to 
early attempts at the characterization of English definite descriptions, or nominal 
expressions with the. Schwarz (2009) shows that both approaches are validated 
by cross-linguistic variation in the semantics of definite descriptions and the 
alternative strategies languages employ to encode definiteness. Cuevas Mixtec 
provides further support for an approach to the semantics of definiteness which 
considers internal variation in the number of strategies for composing definite 
descriptions. 


2.1 Uniqueness and familiarity


There has been a long debate regarding the most proper semantic characteriza- 
tion of definite descriptions, and two approaches in this respect have been more 
prominent. There is a uniqueness approach, which claims that definiteness is the 
function of referring to an entity that is the unique bearer of the property denoted
by the nominal description. The quality of uniqueness need not be absolute, but 
evaluated relative to some contextual domain or situation. Examples of felici- 
tous uses of English definite articles expressing uniqueness include the president 
of the United States and the Taj Mahal, where each expression refers to a thing 
that uniquely satisfies the nominal description with respect to some domain. In 
these cases, the domains are quite broad and seemingly absolute, at least when
constrained to a small scope in time, but expressions like the projector are also 
interpreted as unique within smaller contexts, despite their non-uniqueness in 
the world at large. In the following example, the projector felicitously refers to a 
uniquely identifiable entity when uttered in the context of a lecture hall where 
there is a single projector. When constrained to such a context, there is nothing 
else for the expression to refer to. 


(3) Context: A presentation is about to start within the lecture hall of a 
school. 
The projector is not being used today. (Schwarz 2013: 3) 


It does not matter that there are other projectors in the greater building be- 
yond the lecture hall, which represents a broader domain. There is a communica- 
tive mechanism whereby the speaker constrains a domain so as to ensure the 
uniqueness of the definite description's referent within it. Similarly, expressions 
like the dog or the professor can also be unique within small domains such as a 
family unit or a classroom. In predicate logic, the condition of uniqueness can 
be expressed as universal quantification over the equivalence of referents of a 
nominal predicate. 


(4) ∃x[P(x) & ∀y[P(y) → y = x]]
‘There is an x that is P and all y that are P are identical to x’ (Schwarz
2013: 3)
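
As an informal illustration of the condition in (4), the following minimal sketch evaluates uniqueness relative to a finite contextual domain, using the projector scenario in (3). The function unique_satisfier and the toy domains are hypothetical and not part of any cited analysis; they simply assume that a domain can be modeled as a list of individuals and a nominal description as a predicate over them.

```python
def unique_satisfier(domain, predicate):
    """Return the only individual in `domain` satisfying `predicate`.

    Returns None when no individual or more than one individual satisfies
    the predicate, i.e. when the uniqueness condition in (4) fails.
    """
    satisfiers = [x for x in domain if predicate(x)]
    return satisfiers[0] if len(satisfiers) == 1 else None


def is_projector(individual):
    """Toy predicate standing in for the nominal description 'projector'."""
    return individual.startswith("projector")


# Narrow domain: the lecture hall, which contains a single projector.
lecture_hall = ["projector_1", "podium", "whiteboard"]
# Broader domain: the whole building, which contains a second projector.
building = lecture_hall + ["projector_2", "staircase"]

if __name__ == "__main__":
    print(unique_satisfier(lecture_hall, is_projector))  # uniqueness holds
    print(unique_satisfier(building, is_projector))      # uniqueness fails
```

The same description thus succeeds or fails depending only on which domain it is evaluated against, which is the sense in which uniqueness is relative rather than absolute.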


The second common approach to characterizing definiteness is the familiarity 
approach, which claims that definiteness is the function of referring to an entity 
that is familiar or salient to discourse participants. Researchers have touched on 
a number of ways that familiarity itself could be characterized, such as percep- 
tual accessibility or salience in cultural institutions. Roberts (2003) distinguishes 
between two kinds of familiarity, weak and strong, which differ in whether familiarity
is established through linguistic input. Weak familiarity corresponds
to a broad variety of mechanisms for identifying the referent of an expression be- 
yond linguistic input. Strong familiarity is more precise by its characterization 
as the function of creating an anaphor to a previous linguistic expression in a 
discourse. The following example illustrates this usage, in which the book is an
expression used to further comment on an entity already introduced earlier by a
book.


(5) John bought a book and a magazine. The book was expensive. (Schwarz 
2013: 3) 
As an anaphoric expression, it is important for the definite description to be
preceded by an antecedent, served by the expression a book in the previous ex- 
ample. Without the antecedent, the definite description lacks a referent to refer 
to for the anaphoric usage, and the expression will become awkward, as in the 
example below. 


(6) John bought a newspaper and a magazine. *' The book was cheap. 


The anaphoric use of definite descriptions can be semantically modeled as an 
elaboration on their uniqueness usages with an additional condition. Schwarz 
(2009) claims that familiarity definites feature an additional index argument 1 
which receives an interpretation from an assignment function g. The assignment 
function in turn maps the index to the individual introduced by an appropriate 
indefinite, essentially building a pronoun into the meaning of the definite de- 
scription. 


(7) ιx.P(x) & x = g(1)
‘The unique x that is both P and identical to the individual interpreted from
the assignment function g on the index 1’
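
The role of the index in (7) can be illustrated with a minimal sketch, assuming that the assignment function g can be modeled as a dictionary from indices to individuals already introduced in the discourse. The function familiar_referent and the toy individuals are hypothetical; the sketch only mirrors the formula as stated here and makes no claim about the full analysis in Schwarz (2009).

```python
def familiar_referent(domain, predicate, g, index):
    """Evaluate a toy version of (7): the unique x with P(x) and x = g(index).

    `g` is a dictionary standing in for the assignment function. The result
    is None when the index has no value or when the individual it picks out
    does not satisfy the predicate.
    """
    if index not in g:
        return None
    candidates = [x for x in domain if predicate(x) and x == g[index]]
    return candidates[0] if len(candidates) == 1 else None


def is_book(individual):
    """Toy predicate standing in for the nominal description 'book'."""
    return individual.startswith("book")


# A domain with two books, so plain uniqueness of 'book' would fail here.
domain = ["book_1", "book_2", "magazine_1", "newspaper_1"]

# Discourse (5): 'a book' has introduced book_1 under index 1.
g_after_5 = {1: "book_1"}
# Discourse (6): index 1 is tied to a newspaper, so 'the book' has no antecedent.
g_after_6 = {1: "newspaper_1"}

if __name__ == "__main__":
    print(familiar_referent(domain, is_book, g_after_5, 1))  # antecedent found
    print(familiar_referent(domain, is_book, g_after_6, 1))  # no suitable antecedent
```

On this toy rendering, the identity condition with g(1) does the work of a pronoun: it selects the previously introduced individual even when several individuals in the domain satisfy the description.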


Recent literature on definiteness has been more concerned with strong famil- 
iarity, and since this notion is more relevant to the discussion of definiteness in 
this paper, it will be referred to simply as familiarity throughout. 


2.2 Weak and strong articles of German 


Cross-linguistic investigations on definiteness in general have found good evi- 
dence for the adequacy of both approaches outlined above, with some languages 
even distinguishing the two characterizations of definiteness grammatically. 
Schwarz (2009; 2013) shows that various Germanic languages which feature two 
distinct classes of definite article exhibit a correspondence between the definite 
articles’ meanings and the two dominant analyses of definiteness. For example, 
Standard German features two distinct classes of definite article whose morpho- 
logical differences are apparent by their interaction with prepositions. Standard 
German strong articles like dem in the example below resist morphological fu- 
sion with the preposition, while weak articles fuse with prepositions. The articles 
are otherwise similar in appearance and pronunciation. 


(8) German (Schwarz 2009: 14) 


a. Hans ging zu dem Haus.
Hans went to the_strong house


‘Hans went to the house.’


b. Hans ging zum Haus. 
Hans went to.the_weak house


‘Hans went to the house.’


Schwarz finds a distinction in the meanings each class of definite article
contributes. Weak articles are uniqueness definites that pick out an individual
that is unique relative to some domain, and they generally cannot be used
anaphorically. In a sentence such as (9), the weak article establishes the relative
uniqueness of the referent of Mond ‘moon’ in a broad domain such as Earth.


(9) German (Schwarz 2009: 40) 
Armstrong flog als erster zum Mond. 
Armstrong flew as first.one to.the_weak moon


‘Armstrong was the first one to fly to the moon’ 


Schwarz also finds that strong articles are familiarity definites, which create 
an anaphoric link between a definite description and its antecedent, and they 
cannot create reference to an individual that has not yet been mentioned in the 
discourse. The strong article therefore creates an anaphoric link between the 
two utterances of Buch ‘book’ in the example below, such that the utterances 
refer to the same individual. For comparison, the weak article lacks the
anaphoric properties required to link the two utterances of Buch to a common
referent, a book about sunchokes (Topinambur). Uniqueness does not license the
weak article here either, since a book is hardly unique in the context of the
New York Public Library.


(10) German (Schwarz 2009: 30) 
In der New Yorker Bibliothek gibt es ein Buch über Topinambur.
in the New York library exists EXPL a book about topinambur


Neulich war ich dort und habe #im / in dem Buch nach einer
recently was I there and have in.the_weak / in the_strong book for an
Antwort auf die Frage gesucht, ob man Topinambur grillen kann.
answer to the question searched whether one topinambur grill can


‘In the New York Public Library, there is a book about topinambur.
Recently, I was there and searched in the book for an answer to the
question of whether one can grill topinambur.’


The strong article itself also becomes awkward when combined with nomi- 
nals without an antecedent. In the example below with Bürgermeister ‘mayor’, 
there is no previous mention of a mayor to serve as an antecedent to the definite 
description.


(11) German (Schwarz 2009: 40)
Der Empfang wurde vom / #von dem Bürgermeister eröffnet.
the reception was by.the_weak / by the_strong mayor opened


‘The reception was opened by the mayor.’


Schwarz thus presents strong evidence for a correspondence between the no- 
tion of definiteness (i.e. uniqueness, familiarity) and the morpho-syntactic real- 
ization of the definite article in Standard German. Similar results are robustly 
exhibited for the distinct types of definite article in another Germanic language, 
Fering, with data from Ebert (1971). 


2.3 Bridging 


Before moving on to the discussion of definiteness marking in languages outside 
of the Germanic family, it is worth noting a final set of discourse environments 
where preferences between weak and strong articles have been displayed. When 
Hawkins (1978) set out to understand the semantic source of grammatical dif- 
ferences between definite and indefinite descriptions, he laid out a preliminary 
taxonomy of the distinct uses of the definite article to be later accounted for in 
linguistic models. From the taxonomy, anaphoric uses were those which inspired 
the familiarity approaches to the semantics of definite descriptions. Additionally, 
immediate situation uses and larger situation uses were those which inspired the 
uniqueness approaches, differentiating between smaller and larger spaces within 
which uniqueness is evaluated. If the domain within which uniqueness is evalu- 
ated is a current and localized space where the utterance occurs, this usage may 
be described as an immediate situation use. If the domain is instead a broad or 
global one, considering large expanses of space beyond the utterance situation, 
this usage may be described as a larger situation use. Schwarz uses these dis- 
course environments to test preference for weak or strong articles within nomi- 
nal expressions and finds clear correspondences between semantic environment 
and article preference. 

Hawkins also discussed a fourth usage which has seen mixed results in 
Schwarz's assessment of sensitivity to the presence of weak and strong articles. 
Cases of associative anaphora, or bridging (Clark 1975), constitute anaphoric uses 
of definite descriptions whereby the antecedent is not coreferential, but it refers 
to an item or circumstance which stands in some relation to the referent. The ex- 
ample below shows an anaphoric use of the definite description the ceiling where 
there is no previous mention of a ceiling. However, the existence of a room would
entail in common world knowledge the existence of a unique ceiling without
explicit mention of one.


(12) I looked into the room. The ceiling was very high. (Clark 1975: 171)


Schwarz finds that cases of bridging in Standard German generally have no 
preference for either the weak or strong article. However, there are certain sub- 
cases of bridging that do demonstrate preferences, depending on the kind of re- 
lationship that is exhibited between the definite description and its antecedent. 
Weak articles seem preferred when the relationship between the definite descrip- 
tion and the antecedent is that of a part-whole relationship, in which the referents 
of both expressions relate to each other as though one were an appendage of the 
other. This is demonstrable through the example of a fridge and its crisper. In the 
example below, the nominal Gemüsefach ‘crisper’ prefers co-occurrence with the 
weak article. 


(13) German (Schwarz 2009: 52) 
Der Kühlschrank war so groß, dass der Kürbis problemlos
the fridge was so big that the pumpkin without.problem
im / #in dem Gemüsefach untergebracht werden konnte.
in.the_weak / in the_strong crisper stowed be could
‘The fridge was so big that the pumpkin could easily be stowed in the
crisper.’


Strong articles seem to be preferred when the relationship is something other 
than a part-whole relationship, such as a relation in which the antecedent refers 
to a producer of the referent of the definite description. This is demonstrable 
through the example of a play and its author. In the example below, the nominal 
Autor ‘author’ prefers co-occurrence with the strong article. 


(14) German (Schwarz 2009: 53) 
Das Theaterstück missfiel dem Kritiker so sehr, dass er in seiner
the play displeased the critic so much that he in his
Besprechung kein gutes Haar #am / an dem Autor ließ.
review no good hair on.the_weak / on the_strong author left


‘The play displeased the critic so much that he tore the author to pieces in
his review.’


It is worth asking why these scenarios display preferences that depend on the
kind of relationship established between the definite description and its
antecedent. To answer for the case of the part-whole relationship, Schwarz (2009)
suggests that the preference for definiteness marking associated with uniqueness
derives from an analysis of part-whole relationships as expressing decomposable
situations that entail unique parts. Given some situation, such as one in which
there is a car, one can reasonably assume the existence of unique parts according
to common knowledge, such as a car's license plate. Product-producer
relationships then differ from part-whole relationships because of how detachable
a producer can be from a product across possible situations, requiring some
additional mechanism for the construction of bridging.


2.4 Cross-linguistic variation 


Beyond the German data, Schwarz (2013) finds that many other languages which 
display similar internal variation in strategies of definiteness marking also asso- 
ciate these strategies with either uniqueness or familiarity readings. In a brief 
cross-linguistic survey of how the two notions of definiteness are expressed 
across languages, he shows that not only do Lakhota and Hausa feature two 
distinct types of definite article, they also display a parallel phenomenon to Ger- 
man in associating these articles with either uniqueness or familiarity readings. 
Schwarz also shows that another common strategy for the expression of definite- 
ness across languages is to utilize bare forms of nominal expressions. Languages 
like Akan and Mauritian Creole widely feature bare nominals as definite descrip- 
tions in their grammars. Schwarz further notes that the interpretation of these 
definite bare nominals tends to be only that of uniqueness, parallel to the in- 
terpretation of weak articles in standard German. The example from Mauritian 
Creole below shows two bare nominals later ‘earth’ and soley ‘sun’ serving as 
definite descriptions, denoting two individuals that are uniquely characterized 
by their descriptions in a global domain. 


(15) Mauritian Creole (Wespel 2008: 150; source: O.M.49) 
Later turn otur soley. 
Earth revolve around sun 


‘The Earth moves around the Sun.’


In order to express familiarity, the same languages will employ overt modi- 
fication on nominals, sometimes in the form of definite articles specifically re- 
served for familiarity uses. The parallels observed in the data from these lan- 
guages and Standard German are even encountered in cases of bridging, where a 
grammatical sensitivity to part-whole and product-producer relationships is
displayed. Part-whole relationships favor uniqueness-expressing, bare nominals as
definite descriptions, while product-producer relationships favor overt
definiteness marking. Jenks (2015) confirms Schwarz's observations on languages with
definite bare nominals by presenting data on Thai, in which bare nominals in- 
deed may have uniqueness readings, while familiarity is expressed through the 
use of demonstratives. He further claims that the findings are replicable for sev- 
eral other numeral classifier languages. 

In the rest of this paper, it is shown that Cuevas Mixtec mostly patterns with 
Akan and Mauritian Creole in employing bare nominals for the expression of 
uniqueness, while familiarity is expressed using a series of definite articles which 
encode noun class. However, this generalization only serves for the distinction 
between uniqueness and familiarity for a large subset of the nominal inventory 
of the language. There are some cases of nominals for which bare forms are 
more restricted in their distribution, forcing overt definite articles to also take 
on uniqueness interpretations. This alternative pattern more closely resembles 
the strategy of definiteness marking in English, where there is a single definite 
article for the expression of both uniqueness and familiarity. The choice of which 
nominals are selected for either strategy appears to be systematic, as nominals 
displaying the English strategy tend to be predicates of humanity or personhood. 
Ultimately, the paper shows that languages like Cuevas Mixtec can display inter- 
nal variation in the strategy for definiteness marking, with input from the lexical 
semantics of nominal predicates. 


3 Background on Cuevas Mixtec 


This section presents some historical and linguistic background on the language 
of interest for this paper, Cuevas Mixtec. It first very briefly introduces the Mixtec 
family of languages in a historical context. It then introduces some phonological 
details, along with the working orthography for Cuevas Mixtec in which the data 
are written up. Finally, a brief sketch of some word order patterns observed in 
Cuevas Mixtec is presented. Although the purpose of this paper is not to flesh 
out the phrase structures of the language, some familiarity with basic sentence 
structure is helpful for interpreting the data on definiteness expressions later. 


3.1 Mixtec language family and Cuevas Mixtec 


Mixtec is a family of languages which are indigenous to the Mixteca region of 
southern Mexico. Mixtec speakers are encountered in villages and cities throughout
the Mixteca, which encompasses much of the western half of Oaxaca state
and includes parts of neighboring Guerrero and Puebla states, an area altogether
covering roughly 10,000 square miles (Bradley & Hollenbach 1988). In 1988,
the Mixtec speaker population was almost 250,000 people, although this population
had grown to 477,995 people according to the Mexican census for the year
2010 (INEGI 2010). The language family has been described as being composed of
about 20 mutually unintelligible languages and their variants (Bradley &
Hollenbach 1988). Each village features its own variant of Mixtec, with phonological,
syntactic, and lexical idiosyncrasies. Mutual intelligibility between variants is
often restricted to villages in close geographic proximity, and often enough two 
villages that speak different Mixtec languages are near each other. There is no 
widespread or standard variety of Mixtec, although the variants have been able 
to be categorized into groups according to mutual intelligibility (Egland 1978: 25- 
37) and historical sound changes (Josserand 1983). For these reasons, grammatical 
descriptions of Mixtec languages highlight the village of origin for the variant of 
Mixtec described, as this paper does. 

The Mixtec language family belongs to the greater Otomanguean language 
stock distributed throughout central and southern Mexico today (Rensch 1976). 
Features common to all Otomanguean languages include isolating morphology
and significant representation of morphemic suprasegmental features, such as 
tone and voice quality. Because of the high presence of these features in Otoman- 
guean, languages like Mixtec have been subject to a wealth of phonetic and 
phonological research. In contrast, research into Mixtec for the sake of syntac- 
tic (or semantic) description is much less abundant (Bradley & Hollenbach 1988). 
Within Otomanguean, Mixtec is further grouped with Triqui and Cuicatec into 
the Mixtecan language family, spread throughout the western half of Oaxaca, 
eastern Guerrero, and southern Puebla. 

Although Mixtec speakers are often thought of as a single ethnic group by out- 
siders, Mixtecs themselves tend not to identify with each other in such a manner. 
The great linguistic diversity found within the Mixtec language family reflects
an old culture of village-based ethnocentricity. Mixtec speakers in Mexico
often identify with their home village as a source for ethnic identity (Spores &
Balkansky 2013: 221-223). They identify much less with a broader Mixtec
sociolinguistic heritage, and this is apparent in the history of resource competition
and intercommunity conflict in the Mixteca region, recorded since before the


The term variant here is a common substitute for dialect in discussion about languages of
Mexico. The term dialect has certain political and derogatory connotations in the Mexican
and Latin American context that are preferably avoided. The terms variant and variety replace
dialect in order to disambiguate reference to the high degree of mutual intelligibility that one
speech community has with another.

Spanish Colonial period. This trend would change somewhat for the thousands of
Mixtec temporary laborers and immigrants who had moved to northern Mexico 
and the United States towards the end of the twentieth century. Mixtecs among 
the United States diaspora have accommodated broader ethnic affiliations with 
other Mixtecs, and even other Mexicans of Oaxacan origin, as a response to their 
alienating circumstances as migrant workers (Velasco Ortiz 2005; Spores & Balka- 
nsky 2013: 228-235). Many have organized and formed interest groups around 
issues pertaining to the plight of the broader Oaxacan migrant community in 
the United States and Northern Mexico. However, despite these new develop- 
ments, Mixtec migrants retain strong hometown or village affiliations, and this 
phenomenon has gone hand in hand with Mixtec dialectal diversity for at least 
several centuries. 


Figure 1: San Miguel Cuevas in northwest Oaxaca (personal elaboration)


This paper concentrates on data from Cuevas Mixtec, the particular variant
of Mixtec spoken in the village of San Miguel Cuevas, or ñüüx nüüx yuki ‘the
village on the mountain’ as it is named in the variant. This village is located in
the municipality (or municipio) of Santiago Juxtlahuaca, southwest of the
municipal seat (Figure 1). This location might put Cuevas Mixtec in Josserand's (1983)
classification as a variety of Southern Lowlands Mixtec. According to the 2010
Mexican Census, this village had a population of 522 inhabitants (242 male and
280 female), with 441 citizens over three years of age who spoke an
indigenous language (233 male and 208 female) (INEGI 2010). The local name
for the variant of Mixtec spoken here is tü’üun ndá' ví ‘poor language’, although
there is a movement to replace this manner of referring to the language with
tu’ün savi ‘rain language’ or ‘Dzahui’s language’. Due to early twentieth century
educational policy against the retention of indigenous languages in Mexico,
few Mixtec speakers are trained in written forms of their languages (Velasco
Ortiz 2005: 29), and San Miguel Cuevas has yet to see a standardized written form
of its own.


The stars here are introduced later as marking the presence of floating tones.

Beyond San Miguel Cuevas, speakers of this variant are also found in the 
United States, having immigrated to take on jobs in the service industry, manu- 
facturing, and agriculture. Most of these speakers are immigrants born in Mexico, 
and they are located in Delaware, the Portland metropolitan area of Oregon, and 
Fresno county in California. In Fresno, the speaker population is absorbed into 
the greater Mixtec or Oaxacan community, which also has significant numbers 
of Mixtecs from Yucuquimi de Ocampo and Santiago Tilantongo in Oaxaca, and 
Metlatónoc and Jicayán de Tovar in Guerrero. The variants of Mixtec spoken by 
members of different towns may differ to the extent that Spanish is preferred as 
a means of communication, and Cuevas Mixtec is therefore not widely spoken 
outside the home. Within the home, Cuevas Mixtec is spoken more frequently to 
varying degrees. Some local radio stations have accommodated some program- 
ming in several local varieties of Mixtec at special times, though I am not certain 
that they have had programming in Cuevas Mixtec. Local rap artist Miguel “Una
Isu” Villegas has incorporated Cuevas Mixtec into the lyrics of several of his 
songs, and these songs are available on several media-sharing websites. 


Dzahui is the name of the Mesoamerican rain deity that appears in Mixtec codices and ancient
stone carvings. The movement to rename all Mixtec languages as local translations of ‘rain lan- 
guage' or 'Dzahui's language' has spread into much of the Mixteca region besides San Miguel 
Cuevas, though I have not been able to trace its origin or motivation. Some motivation may 
come from the fact that the veneration of rain deities is a practice of the native Mixtec religion 
which has survived the imposition of Catholicism in the colonial era. In San Miguel Cuevas, 
there is a special stone named Saint Michael which is provided offerings in exchange for the 
prospect of rain. I have also been told about a stone of similar purpose in Ixpantepec Nieves 
which has retained the name 'rain' or Dzahui.


3.2 Orthography of Cuevas Mixtec


For each example sentence of Mixtec throughout this paper, I present the data 
with three transcription tiers: transcription, morpheme gloss, and translation. 
Transcriptions are written using a variant of the official Mixtec orthography en- 
dorsed by the Academy of the Mixtec Language (Academia de la Lengua Mixteca 
or Ve'e Tu'un Savi), instead of phonetic symbols from the International Phonetic 
Alphabet (IPA). This is intended to facilitate reading for Mixtec scholars and other 
readers familiar with this orthography. Table 1 presents the current alphabet for 
Cuevas Mixtec, with corresponding IPA symbols for each entry. There are five
oral vowels a e i o u, voiceless affricate and plosives ch k ku p t ty, voiced plosive
d, prenasalized stops mb nd ndy ng, nasal stops m n ñ, voiceless fricatives s sy
x, voiced fricatives v y, and liquids l r. There is additionally a glottal stop of
ambiguous phonemic status, and this is written with an apostrophe.


Table 1: Cuevas Mixtec orthography and phone correspondences 


a b ch d e i
lal lb Al IV J/el Di 


j ju k ku l m 
/x/.— Ji JK ni JH /m/ 


mb n nd ndy ng ñ
Mp) m Py et. Al A 

o p r s sy t 
lo] pl Il  /s Id M 


ty u v x y 


Otomanguean languages are well known for the preponderance of suprasegmental
features, and Cuevas Mixtec is no exception. Nasalized vowels are represented
with an adjacent n as in an en in un. Cuevas Mixtec is also a highly tonal
language, with three level tones that combine into nine possible contours. High
tones, low tones, and mid tones are marked with acute accent á, grave accent à,
and macron ā, respectively. With the help of more recent work on similar
varieties of Mixtec (Carroll 2015), I have also been able to identify the presence of
floating tones. I have not thoroughly investigated the distribution of floating
tones, so they are inconsistently marked in the data. Where they are marked, the
pending convention I have chosen is to represent their presence with a star x.


The reader may notice that the letter u is used for representing both a vowel and a secondary
feature of two consonants ku and ju. In the data, tone marking always occurs on vowel symbols,
including those for /u/. This distinguishes the occurrence of u as a vowel from its occurrences in
consonant digraphs, which do not take tone marking.

The voiced plosive d seems to only occur on one pronoun and is likely an allophone of the
voiceless plosive t.

The plosive b and the fricatives j ju occur in loanwords and proper names from Spanish.

The tap r seems to only occur in some pronouns and may be an allophone of ty.

Many pronouns in the language have cliticized forms. For clitics, I stray from 
the Academy of the Mixtec Language and follow a convention observed in 
Bradley & Hollenbach's (1988) Studies in the syntax of Mixtecan languages se- 
ries by representing clitics as orthographically detached from host words. The 
detachment is represented by horizontal space in the data, and therefore, no
special marking for clitics is used. Finally, the data itself throughout this paper is
marked for acceptability as a phrase or sentence of the language, unacceptability,
infelicity, and spontaneous elicitation. Acceptable phrases and sentences are
marked with a checkmark ✓, semantically or grammatically anomalous phrases
are marked with an asterisk *, and infelicitous sentences are marked with a pound 
sign #. Spontaneously elicited phrases and sentences, or those which were pro- 
duced by a speaker in speech or translation, are unmarked. 


3.3 Word order patterns of Cuevas Mixtec 


This subsection covers basic word order patterns encountered in the language in 
order to facilitate reading of definiteness data later. Basic sentence structure is 
presented first, and Cuevas Mixtec is shown to be a VSO language with certain 
conditions for optional or obligatory repositioning of verb arguments to a pre- 
verbal position. Some aspects of the structure of the noun phrase are presented 
afterwards in order to demonstrate the distribution of definite articles with re- 
spect to other modifiers later. The subsection then presents examples of the dis- 
tribution of noun classifiers, which have occurrences as the definite articles of 
the language. 


Floating tones are applied to the first vowel of the following word, and their value depends on
the tone value of the last vowel of the word they originate from. They manifest as high tones 
when the tone of the last vowel is low and as low tones otherwise. Therefore, the floating tone 
from nüüx ‘face’ will be high, while the floating tone from chitüx ‘cat’ will be low. 
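
The rule in this note for the value of a floating tone can be stated as a small function. The sketch below is hypothetical: the name floating_tone and the labels H, M, and L for high, mid, and low tones are conventions assumed here, and the function covers only the value of the floating tone, not how that tone combines with the tone already present on the following vowel.

```python
LEVEL_TONES = {"H", "M", "L"}  # assumed labels for high, mid, and low tones


def floating_tone(last_tone: str) -> str:
    """Return the floating tone contributed by a word, given the tone of
    that word's last vowel.

    Following the rule in the note: the floating tone is high when the
    last vowel is low, and low otherwise.
    """
    if last_tone not in LEVEL_TONES:
        raise ValueError(f"unknown tone label: {last_tone!r}")
    return "H" if last_tone == "L" else "L"


if __name__ == "__main__":
    for tone in ("H", "M", "L"):
        print(tone, "->", floating_tone(tone))
```

The design is deliberately minimal: because the note describes only a two-way outcome conditioned on whether the final tone is low, a single conditional is enough, and mid-final and high-final words fall together.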

The result of this convention is the lack of representation of data where clitics combine with 
truncated forms of host words. Truncation often occurs on long vowels or [V?V] strings after 
a clitic without a consonant is attached.


3.3.1 Basic sentence structure


Cuevas Mixtec is a verb-initial language, like many languages of the Mesoamer- 
ican area. The subject argument of a verb consistently follows the verb if it is a 
clitic pronoun. The object argument follows the subject in transitive sentences. 


(16) ✓ kixi rà
sleep.IPFV 3SG.M


‘He is sleeping.’


(17) ✓ kuni rí tyikuif
want.IPFV 3SG.AML water


‘It wants water.’


Acceptable subject placement varies when the subject argument is not a clitic 
but a full nominal. SVO word order is often the preferred word order for sen- 
tences with non-clitic subject arguments uttered without a discourse context. 
VSO word order in this case is often strange without a discourse context pre- 
sented beforehand. 


(18) Context: Ø
a. ✓ [tyà Juáàn] isyiin [iin kárró]
the.SG.M Juan buy.COMPL one car
‘Juan bought a car.’
b. # isyiin [tyà Juáàn] [iin kárró]
buy.COMPL the.SG.M Juan one car
(‘Juan bought a car.’)


(19) Context: Ø
a. [ndyi’i va na] isyi’i
all FOC 3.HUM die.COMPL
‘Everyone died.’
b. # isyi’i [ndyi’i va na]
die.COMPL all FOC 3.HUM
(‘Everyone died.’)


Example (19) features a focus-sensitive particle và which occurs in many other examples
throughout the data. It serves many roles such as emphasizer, restrictive/exclusive particle,
and aspectual particle, similar to English just. Its role in (19) is uncertain, though speakers note
that it is optional in this case.


VSO word order becomes preferred with the addition of adverbial modifiers. 
Temporal adverbs like ikü ‘yesterday’ allow for non-clitic subjects to occur 
postverbally without a discourse context. 


(20) Context: Ø
✓ iku isyiin [tyà Juáàn] [iin kárró]
yesterday buy.COMPL the.SG.M Juan one car


‘Yesterday, Juan bought a car.’


Wh-questions also confirm the basic word order to be VSO. Cuevas Mixtec
features obligatorily preposed wh-words in wh-questions. Even if the wh-word
is not a verbal argument, verbal arguments are unable to occur between the verb
and the wh-word.


(21) a. ✓ ndyíí isyiin [tyà Juáàn] [iin kárró]
where buy.COMPL the.SG.M Juan one car
‘Where did Juan buy a car?’
b. * ndyíí [tyà Juáàn] isyiin [iin kárró]
where the.SG.M Juan buy.COMPL one car


(‘Where did Juan buy a car?’)


There are some instances of a clitic pronoun co-occurring with a preverbal 
nominal as a coreferential item, similar to a resumptive pronoun or overt trace. 


(22) [tyà Juáàn] isyiin rà [iin kárró]
the.SG.M Juan buy.COMPL 3SG.M one car


‘Juan bought a car.’


This seems to indicate a sort of topicalization strategy, where the preverbal 
nominal occurs in a topic position while the pronoun serves as the true verb ar- 
gument. There are three reasons for suggesting this proposal. First, conjunction 
of sentences shows that preverbal subject arguments are restricted in their dis- 
tribution when these resumptive pronouns occur. A preverbal subject argument 
cannot occur for each conjunct sentence when the resumptive pronoun occurs in
each. If each conjunct sentence has a resumptive pronoun, the preverbal subject
argument occurs once for the entire utterance, taking scope above the conjunction
itself. Preverbal subjects may occur within each conjunct sentence as long
as there is no resumptive pronoun present. 


(23) a. * [ndyi’i va na] syita nā tya [ndyi’i va na]
all FOC 3.HUM sing.IPFV 3.HUM and all FOC 3.HUM
syitasya’a na
dance.IPFV 3.HUM
(‘Everyone sings, and everyone dances.’)
b. ✓ [ndyi’i va na] [syita nā tyà syitäsya’a na]
all FOC 3.HUM sing.IPFV 3.HUM and dance.IPFV 3.HUM
‘As for everyone, they sing and dance.’
c. ✓ [ndyi’i va na] syita tya [ndyi’i va na] syitäsya’a
all FOC 3.HUM sing.IPFV and all FOC 3.HUM dance.IPFV


‘Everyone sings and dances.’


Secondly, there are certain types of modified nominals which would be barred 
from serving as topics as they are non-referential, such as negated nominals. If 
a negated nominal occurs in the preverbal position, and it is interpreted as the 
sentence subject, a clitic subject cannot occur in the subject position after the
verb(s).


(24) [ni ün nà tyaa] kuni (na) küsü (*na)
not.even one the.HUM man want.IPFV 3.HUM sleep.IRR 3.HUM


‘No men wanted to sleep.’


(25) [cháá nà tyàa] syita (*nà)
less the.HUM man sing.IPFV 3.HUM


‘Few men are singing.’


Thirdly, preverbal nominals are crucial for the expression of generic state- 
ments. Postverbal subject arguments force a progressive aspectual interpreta- 
tion of the sentence below, while preverbal subject arguments allow for a topic- 
comment reading of the same material. It is not crucial that the resumptive pro- 
noun occur for triggering the topic-comment reading. 


(26) a. syei [tyí chítüx] tyiinx
eat.IPFV the.AML cat mouse


‘The cat is eating mice.’


b. [tyí chítüx] syéi (ri) tyiinx
the.AML cat eat.IPFV 3SG.AML mouse


‘The cat eats mice.’


Without the resumptive pronoun, the sentence is interpreted as being an an- 
swer to a question. This might suggest that preverbal arguments without co- 
occurring resumptive pronouns are focalized. 


3.3.2 Basic noun phrase structure 


Nouns in Cuevas Mixtec do not require modification in order to occur as verb 
arguments. They frequently occur in bare forms and are often interpreted as in- 
definites in such cases. 


(27) chítüx syéí ri tyiinx 
cat eat.IPFV 3SG.AML mouse 


‘Cats eat mice.’


Different classes of nominal modifiers occur before or after the nominal. There 
are at least four classes of items which may occur prenominally: quantifiers, nu- 
merals, definite articles, and a specifier. The specifier mii serves as a reflexive 
when modifying a pronoun, as in the case of mii ra ‘himself’. 


(28) [tya Juaan] káni ra [na küsü mii ra]
the.SG.M Juan want.IPFV 3SG.M COMP sleep.IRR SPEC 3SG.M


‘Juan wants that he himself sleep.’


While modifying a nominal, the function of the specifier seems to be that of 
encoding focus, or the presentation of the modified nominal as new information. 


(29) [mii tyà Juáàn] sátátá
SPEC the.SG.M Juan heal.IPFV


‘Juan is the one healing others.’


(30) ✓ [mií tyina] ndé'i
SPEC dog bark.IPFV
‘It is the dog barking.’


Quantifiers occur in a prenominal position. The examples below include the 
quantifiers ndyri'i ‘all’ and cháá ‘few’. 
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(31) [ndyr'i tyina] nde’r 
all dog  cryiPFV 
"All dogs bark: 


(32) [cháá tyina] kumi nda’a 
few dog have.IPFV hand 


‘Few dogs have hands.’ 


(33) [ndyikuií nii] và'a a 
all salt good 3.INA 
‘All salt is good.’ 


Quantifiers do not seem to co-occur with the specifier. This might suggest that 
quantifiers and the specifier belong to the same grammatical class. 


(34) *[mií ndyi'i tyina] ndé'i 
SPEC all dog cry.IPFV 
(‘It is all dogs that bark.’) 


Numerals also occur prenominally, though they differ from quantifiers in be- 
ing able to co-occur with the specifier. The examples below feature the numeral 
ü'ün ‘five’. 

(35) isyini ra [ü'ün ñándyri] 
see.COMPL 3SG.M five sun 


‘He saw five suns.’ 


(36) ✓ [mii ü'ün tyina] ndé'i 
SPEC five dog cry.IPFV 


‘It is five dogs that are barking.’ 


Quantifiers differ amongst themselves in their capacity to co-occur with nu- 
merals. The quantifier ndyi'i ‘all’ seems to be able to co-occur with numerals, but 
säva ‘half’ cannot. 


(37) ndyi'i kümi tyina yó'o 
all four dog here 
‘the four dogs here’ 


(38) *[sava üsyi tyina] ndé'i 
half ten dog cry.IPFV 
(‘Half of the ten dogs are barking.’) 


The quantifier ndyi'i can be reduced by a syllable, to ndyi, when it co-occurs 
with a numeral, while this reduction is not possible before a bare noun. This suggests 
that the reduced item is not identical to other instances of the quantifier ndyi'i. 


(39) ndyi kumi tyina yó'o 
all four dog here 
'the four dogs here' 


(40) [ndyi und tyaa] isyi’i 
all eight man die.COMPL 
‘The eight men died.’ 


(41) [ndyi uvi tyina] ndé'i 
all two dog cry.IPFV 
‘The two dogs bark.’ 


(42) * ndyi isi 
all deer 
(‘all deer’) 


A large number of items may occur postnominally, including demonstratives 
and relative clauses. The following example features a demonstrative kaa ‘over 
there’, which follows the nominal within the noun phrase. 


(43) [tyà Juaan] isya’an rā [nuux kaa] 
the.SG.M Juan go.COMPL 3SG.M village over.there 


‘Juan went to the village over there.’ 


3.3.3 Noun classifiers and their functions 


Cuevas Mixtec features a robust grammatical gender system which is exhibited 
through both its pronoun and noun classifier inventories. Noun classifiers in 
Cuevas Mixtec are semi-pronominal items which explicate, and are sensitive to, 
the underlying system of grammatical gender in the language. They are 
semi-pronominal because, unlike pronouns, they are typically not interchangeable 
with nominals. They exhibit several grammatical functions in the grammar of the 
language, including at least their uses as definite articles and relative pronouns, 
as is shown in this subsection. Table 2 provides the inventory of noun classifiers 
in the language. They often phonotactically resemble their cliticized pronomi- 
nal counterparts, but not in all cases. These noun classifiers are not unique to 
Cuevas Mixtec among Mixtec languages, and one may find their analogues across 
the family. They are called prestressed pronouns in Bradley & Hollenbach's (1988) 
Studies in the syntax of Mixtecan languages series, where they are described for 
several very different varieties of Mixtec. The Mixtec languages differ widely in 
the exact inventory of genders that are recognized grammatically. Macri (1983) 
observes the gender systems of six different Mixtec varieties. All of these vari- 
eties had masculine, feminine, and animal genders, though they differed in rec- 
ognizing inanimate, youth, liquid, and sacred genders. 


Table 2: Classifiers vs. pronouns 


Gender    Classifier    Pronoun 


M         tyà           rà 
F         ñá            pe 
YTH       tā            syl 
HUM       na            na 
AML       tyi           ri 
STR       tü            du 
LIQ       ndrä          r 
INA       na            na 


The grammatical function of these classifiers that is of primary interest for this 
paper is their use as definite articles, although their uses extend beyond 
these cases. When occurring in the prenominal position, these items contribute 
the meaning of a familiar individual which satisfies the nominal description. Since 
they encode gender, they show agreement constraints which bar a noun classifier 
from modifying a noun with a conflicting inherent gender. 


(44) tyà tyaa 
the.SG.M man 


‘the man’ 


(45) *ndrá tyaa 
the.LIQ man 


(‘the man’) 
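

The agreement constraint illustrated above can be sketched as a simple compatibility check. The snippet below is only an illustrative toy model and not part of the analysis: the classifier-to-gender mapping follows the gloss labels used in this section, while the mini-lexicon `NOUN_GENDERS` and the function name `agrees` are assumptions introduced here for illustration.

```python
# Toy model of classifier-noun gender agreement (illustrative assumption only).

# Classifier forms mapped to the gender labels used in the glosses.
CLASSIFIER_GENDER = {
    "tyà": "M",
    "nà": "HUM",
    "tyí": "AML",
    "ndrá": "LIQ",
}

# Hypothetical mini-lexicon: the genders each noun is compatible with,
# inferred only from glossed examples in this section.
NOUN_GENDERS = {
    "tyaa": {"M", "HUM"},   # 'man'
    "tyina": {"AML"},       # 'dog'
    "tyikuii": {"LIQ"},     # 'water'
}


def agrees(classifier: str, noun: str) -> bool:
    """Return True if the classifier's gender is compatible with the noun."""
    return CLASSIFIER_GENDER[classifier] in NOUN_GENDERS[noun]


print(agrees("tyà", "tyaa"))   # masculine classifier with 'man'
print(agrees("ndrá", "tyaa"))  # liquid classifier with 'man'
```

Under these assumptions, the first call models the acceptable pairing in (44) and the second the conflicting pairing in (45).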


Among the prenominal modifiers, definite articles occur closest to the 
noun. They seem to be able to co-occur with all other prenominal items. When 
co-occurring with quantifiers or numerals, they form partitive 
constructions. 


(46) [sava tyí tyina] ndé'i 
half the.AML dog cry.IPFV 
‘Half of the dogs bark/are barking.’ 


(47) [üvi nà tyaa] kuanu’u 
two the.HUM man go.home.IPFV 


‘Two of the men went home.’ 


(48) [üvi tyí tyina] ndé'i 
two the.AML dog cry.IPFV 
‘Two of the dogs bark.’ 


They even co-occur with whatever combinations of quantifier and numeral 
are possible in the language. 


(49) [ndyi üvi nà tyaa] isyi'i 
all two the.HUM man die.COMPL 
‘The two of the men died.’ 


(50) [ndyi üvi tyí tyina] ndé' 
all two the.AML dog cry.IPFV 
‘The two of the dogs bark.’ 


Noun classifiers prescriptively occur with proper names to denote individuals, 
although they may be dropped from names in very casual speech. Proper names 
without classifiers also refer to names themselves. 


(51) [(tyà) Kornélíó] kon [tyà káa] 
the.SG.M Cornelio be.IPFV the.SG.M over.there 


‘That guy over there is Cornelio.’ 


(52) [tya yó'o] nani rā Kornélíó 
the.SG.M here be.called.IPFV 3SG.M Cornelio 


‘This guy is called Cornelio.’ 


In addition to nouns, noun classifiers also modify adjectives, functioning as 
nominalizers while encoding definiteness. 


(53) ñà yó'o 
the.INA here 


‘this one here’ 


(54) na kuii 
the.INA green 


‘the green one’ 


This strategy of nominalization extends to verb phrases, forming what appear 
to be light-headed relative clauses. 


(55) na isya’a [ñá Maria] [nààx ra] 
the.INA give.COMPL the.F Maria face 3SG.M 


‘the one that Maria gave to him’ 


They also serve as relative pronouns in the sense that they introduce a relative 
clause which bears a full nominal head. Agreement in gender between the rela- 
tive pronoun and the relative clause head remains just as important as between 
nominals and definite articles. 


(56) tutu na isya’a [ñá Maria] [nuux ra] 
book the.INA give.COMPL the.F Maria face 3SG.M 
‘the book that Maria gave to him’ 


(57) [tyà Juaan] isyi'i rā [tyikuii ndrá iix va’a] 
the.SG.M Juan drink.COMPL 3SG.M water the.LIQ NEG good.IPFV 


‘Juan drank the water that was not good.’ 


Full nominal heads in relative clause structures may themselves take on definite 
articles while a relative pronoun occurs at the same time. This shows that 
the two usages of noun classifiers, as definite articles and as relative pronouns, 
are grammatically distinct. 


(58) [na tutu] na isya’a [ñá Maria] [nààx ra] 
the.INA book the.INA give.COMPL the.F Maria face 3SG.M 
‘the book that Maria gave to him’ 


Lastly, relative clauses may even occur on proper names to serve as appositive 
relative clauses, as in the example below. 


(59) [tyà Juaan tyà kátóó ka'vi] kuá'à và'à ka’vi rà 
the.SG.M Juan the.SG.M like.IPFV read.IRR much good read.IPFV 3SG.M 


‘Juan, who likes reading, reads a lot.’ 


4 Regular nominals and definiteness encoding 


This section presents the semantic evidence for one of two claims. Cuevas Mix- 
tec very much patterns with other languages displaying multiple strategies for 
encoding definiteness by associating those strategies with different notions of 
definiteness. For many nominal items of the language, definite bare nominals are 
interpreted as unique with respect to some domain or situation, while overt defi- 
nite articles contribute an anaphoric element to the interpretation of the nominal. 
This is shown by observing patterns in the choice of definiteness marking strat- 
egy within the semantic environments of both immediate and larger situation 
uses, anaphoric uses, and bridging. Thus, Cuevas Mixtec displays the correspon- 
dence of bare form with uniqueness interpretation, and overt marking with famil- 
iarity interpretation, that has been noted for other languages by Schwarz (2013) 
and Jenks (2015). Most nominals of the language pattern this way, encompassing 
predicates without clear semantic associations among themselves, such as crea- 
tures and buildings. For this reason, it is assumed that these nominals represent 
a default in definiteness encoding, owing them the label of regular nominals. 


4.1 Uniqueness with regular nominals 


Regular nominals in their bare forms may be interpreted as uniqueness definites, 
and this is evidenced by the use of bare forms for various non-anaphoric purposes 
explained by Hawkins (1978). Bare forms are the natural form of regular nominals 
for the expression of larger situation definiteness, meaning that they are able to 
encode definiteness as characterized by reference to an entity uniquely identified 
within general world knowledge. The word yoo ‘moon’ refers to an entity uniquely 
identified as a moon in most real world interactions, and it displays resistance to 
modification by a noun classifier. 


(60) ✓ [tyà Juáàn] ndé'é rā (#ñà) yoo 
the.SG.M Juan look.IPFV 3SG.M the.INA moon 


‘Juan is looking at the face of the moon.’ 


Bare nouns are also used for immediate situation definiteness. They encode 
reference to an entity whose description is unique with respect to contextual 
knowledge that is shared between interlocutors. This is similar to larger situation 
definiteness in that uniqueness is anchored to a domain of shared knowledge, 
but it differs in that this domain is quite small, non-global, or situational. Any 
particular dog is not unique on a global scale, but dogs can be unique relative to 
their owners, as in the case of a family dog. Thus, the word tyina ‘dog’ rejects 
modification by a noun classifier in the following example. 


(61) Context: A family’s dog has gone missing for a week. A relative enters 
their house one day to find them cheerful and then proceeds to ask why 
they are suddenly happy. 
indyikókoo (#tyí) tyina 
return.COMPL the.AML dog 


‘The dog came home!’ 


The results are readily replicated for many examples of localized uniqueness. 
Many villages in the Mixtec region of Mexico have exactly one church, so that the 
church is unique relative to its village. The word veñü'ü ‘church’ may not take a 
definite article in the context provided below. 


(62) Context: A man is visiting a Mixtec village, many of which have one 
church. 
isyini i (#ñà) veñü'ü 
see.COMPL 1SG the.INA church 


‘I saw the church.’ 


There is another usage not identified by Hawkins, though it is observed in 
more recent studies of definiteness. So-called weak definites (Carlson 2006) are ac- 
tually neither unique nor anaphoric. They are nominals which appear to take on 
definiteness marking without referring to specific individuals. Below, the weak 
definite the hospital seems to refer more to the situation of being in a hospital 
rather than being in a particular one. 


(63) Every accident victim was taken to the hospital. (John to Mercy Hospital, 
Bill to Pennsylvania Hospital, and Sue to HUP) (Schwarz 2014: 3) 


I have been able to identify at least one word, ya’vi ‘market’, which constitutes 
a case of a weak definite. In the example below, the word takes on a bare form, 
without definite articles. 


(64) Context: Five people leave a room and return, having each gone 
shopping at a different market. 
✓ [ndyi'i va na] isya'àn ya’vi 
all FOC 3.HUM go.COMPL market 


‘All of them went to the market.’ 


A final note concerning the encoding of uniqueness for regular nominals is 
that these nominals may not necessarily occur in bare forms for a uniqueness 
interpretation. Modified nominals may also have uniqueness interpretations at 
least in the case of partitive constructions. The example below demonstrates that 
a larger situation definite like yoo ‘moon’ may retain its uniqueness interpreta- 
tion while modified by the quantifier säva ‘half’. The resulting partitive construc- 
tion is not interpreted as quantification over a group of moons, but quantification 
over portions of the unique moon with respect to Earth. 


(65) [tyà Juaan] isyini ra [sava yoo] 
the.SG.M Juan see.COMPL 3SG.M half moon 


‘Juan saw half of the moon.’ 


This might suggest that uniqueness is interpretable within the complement of 
a quantifier. If that is true, it would entail that uniqueness interpretations of bare 
nominals syntactically correspond to an embeddable phrasal projection of some 
sort, such as a determiner phrase. However, the exact structural relationship be- 
tween quantifiers and nominals in Cuevas Mixtec remains to be explained. 


4.2 Familiarity with regular nominals 


Besides bare forms of nominals, definite descriptions are also formed with overt 
marking by means of the language's definite articles, but the occurrence of def- 
inite articles comes with an alternative set of functions. Definite articles are 
somewhat awkward when occurring in semantic environments that suggest the 
uniqueness of the definite description's referent, as shown previously. Definite 
articles are strongly preferred when used to indicate an anaphoric relationship 
with an antecedent nominal, corresponding to the interpretation of the definite 
description as familiar. This would follow Schwarz and Jenks's findings 
that languages which feature bare nominals as definite descriptions in addition 
to overt definiteness marking tend to reserve the overt marking for the expres- 
sion of familiarity. In the narrative below, the first sentence presents a charac- 
ter, Juan, who is visiting a library to obtain a book that he is searching for. The 
two follow-up sentences are near identical in form, though they differ in their 
anaphoric properties due to the presence or absence of the definite article na. 
The first follow-up sentence has a bare nominal libri ‘book’ which is interpreted 
existentially under negation. This follow-up then claims that there are no books 
at all at the library. In contrast, the presence of the definite article in the sec- 
ond follow-up allows for continued comment on the book Juan was looking for, 
saying that it was absent from the library without comment on other books. 


(66) isya'àn [tyà Juaan] bibliotéká taan na mi ra [iin libri] 
go.COMPL the.SG.M Juan library so COMP obtain.IRR 3SG.M one book 


‘Juan went to the library in order that he get a book.’ 
a. suu koó libri índaà kaa 
but NEG book located.IPFV over.there 
‘But there was no book there.’ 
b. ✓ suu koó [na libri] índaà kaa 
but NEG the.INA book located.IPFV over.there 
‘But the book was not there.’ 


Because only the second follow-up is a continued comment on the book in the 
first sentence, it is the definite article which creates the crucial anaphoric link. 

It is not crucial for the speaker to be familiar with the identity of the individual 
denoted by the definite nominal. The speaker may invoke a definite article for 
creating anaphoric links between coreferential nominals if their referent is learned 
about from hearsay. The narrative below introduces an unspecified turkey that 
can only be referred back to with the occurrence of the definite article tyí. The 
two follow-up sentences are both declarations of hearsay that a turkey was sick, 
though only the second follow-up sentence is felicitous because of the presence 
of the definite article. 


It is worth noting here that there is a bit of variation in judgment across generational lines 
about the use of definite articles. The data here better reflects younger generational speech, 
which features broader usage of definite articles for creating anaphora. Older speakers seem 
to dislike definite articles on regular nominals, or are at least much pickier about when they are 
used. This might indicate a diachronic shift in the use of definite articles from something other 
than familiarity, which might also coincide with the development of definite articles in Cuevas 
Mixtec. To my knowledge, definite articles are rarely described for other Mixtec variants. 


(67) isya'ní [tyà Juáàn] kólo 
kill.COMPL the.SG.M Juan male.turkey 
‘Juan killed a turkey.’ 
a. # káchí nā na kü'wi và kólo 
say.IPFV 3.HUM COMP sick.IPFV FOC male.turkey 
‘They say that a turkey was sick.’ 
b. ✓ káchí nā na kü'wi và [tyí kólo] 
say.IPFV 3.HUM COMP sick.IPFV FOC the.AML male.turkey 


‘They say that the turkey was sick.’ 


Again, the first follow-up sentence is bizarre, but this time because it is inter- 
preted as an assertion about a different turkey. The second follow-up sentence is 
interpreted as being about the same turkey, thanks to the presence of the definite 
article. 

Definite articles may even occur on mass nouns, where their presence similarly 
encourages the formation of an anaphoric link between the definite description 
and a coreferential antecedent. The presence of the definite article allows a nom- 
inal to refer back to a particular collection of mass that was introduced before. 
The following example introduces a patch of salt that is later commented further 
upon as being brown. 


(68) isyika [tya Juáàn] nùùx nii 
walk.COMPL the.SG.M Juan on salt 
‘Juan walked on salt.’ 

a. yaa va [#(ñà) nii] 
brown FOC the.INA salt 


‘The salt was brown.’ 


An interesting effect is observed with overt definiteness marking on mass 
nouns. The occurrence of the article encourages the interpretation of the nominal 
referent as being unitized in some manner, so as to distinguish a particular body 
of mass. In the following example with introductory and follow-up sentences, 
the occurrence of the definite article turns out to be optional, but with distinct 
effects in the interpretation of the referent of tyikuii ‘water’. The lack of the article 
forces the interpretation of the nominal’s referent to be a greater collection 
of water that is salient within a situation and which may not altogether be a 
participant in the drinking event. The occurrence of the article encourages an 
interpretation of the nominal’s referent to be a delimited amount of water which 
is participating in the drinking event, such as from a bottle. 


(69) [tyà Juaan] isyi'i tyikuii 

the.SG.M Juan drink.COMPL water 

‘Juan drank water (of unspecified source).’ 

a. iix va’a tyikuii 
NEG good.IPFV water 
‘The water (from the river or lake) was not good.’ 

b. iix va’a [ndrá tyikuii] 
NEG good.IPFV the.LIQ water 


‘The water (from a bottle) was not good.’ 


This seems to be in line with the findings on definiteness in Cuevas Mixtec 
for count nouns. The first follow-up sentence appears to represent a case of 
immediate situation definiteness, whereby the bare nominal indicates a unique 
individual relative to a situation. The bare nominal must then refer to the maximal 
amount of water given a situation, which is not identical to the amount of water 
that Juan drank. The article allows the nominal to refer back to the unitized water 
that Juan had drunk, which can then be commented on further. 


4.3 Bridging 


The last usage of definite descriptions to be addressed in this paper is 
bridging. For many examples of bridging, both bare nominals and nominals with 
definite articles may serve as anaphora for a non-coreferential antecedent, but 
there are also some cases which demonstrate a clear preference for one strategy 
of definiteness encoding over the other. It turns out that these special cases 
include those relationships between definite descriptions and their antecedents 
that were first outlined by Schwarz (2009), and Cuevas Mixtec patterns with 
other languages by invoking its strategies for marking uniqueness or definiteness 
for the same cases. In this language, when there is a part-whole relationship 
between the definite description and the antecedent, the definiteness marking 
strategy of choice is the bare nominal, which indicates uniqueness. The following 
example has the definite description tú yé’é ‘the door’ in a part-whole relationship 
with the indefinite nominal iin ve’e ‘a house’. The occurrence of the definite 
article tú is dispreferred. 


Schwarz has reported on encountering some variation among speakers’ intuitions regarding 
bridging examples, and I have found similar variation for these examples in Cuevas Mixtec. 


(70) isyüun [tyà Juáàn] [iin ve'e] 
buy.COMPL the.SG.M Juan one house 


‘Juan bought a house.’ 


a. syda tá'vi và [(#tú) yee] 
already broken.IPFV FOC the.STR door 


‘The door was broken.’ 


Also in line with observations from other languages, since the antecedent is 
not coreferential with the definite description, it need not even be an individual. 
The antecedent can also be a situation that is introduced and which may naturally 
entail conditions such as the existence of certain kinds of entities. The example 
below has a definite description kárró ‘car’ with an antecedent adverbial phrase 
tá'an sakaka which introduces a situation of driving. Note that the inclusion of 
the definite article sounds awkward to speakers in this case. 


(71) Context: Juan has a strange hearing problem which causes him to go deaf 
or have selective hearing in special circumstances. 
[tá'àn sákaka tyà Juáàn] iix küvi tasóo ra 
every.time drive.IPFV the.SG.M Juan NEG can hear.IPFV 3SG.M 
[(#tú) karro] 
the.STR car 


‘Every time Juan drives, he cannot hear the car.’ 


The scenario presented in the adverbial phrase entails the existence of some 
vehicle to be driven within the event. The car is interpreted as being part of the 
driving event, which perhaps induces the choice of the bare form for the definite 
description. 

Finally, when there is a producer-product relationship between the definite 
description and the antecedent, the preferred definiteness marking strategy is 
overt marking. The example below presents a scenario of a purchased 
book which necessarily has an author. Authors are not in part-whole relationships 
with books but producer-product relationships with them, so the nominal 
aütöor ‘author’ preferably takes on a definite article for association between the 
nominal's referent and an aforementioned book. 


However, speakers' intuitions regarding bridging examples vary specifically in the strength of 
preference for one definiteness marking strategy over optionality between the two. They do 
not vary with respect to which strategy is preferred, and when intuitions are strongest, the 
findings in the data align with Schwarz's and Jenks' findings in other languages. 


(72) [tya Juaan] isyun ra [iin tūtū] 
the.SG.M Juan buy.COMPL 3SG.M one book 
‘Juan bought a book.’ 
a. [#(tya) autoor] kiu ra [tya nüüx nuux yuku] 
the.SG.M author be.IPFV 3SG.M the.SG.M village face mountain 


‘The author was (one) from San Miguel Cuevas.’ 


As far as the data regarding regular nominals, definiteness marking strategy, 
and interpretation are concerned, there are few surprises, if any. The next sec- 
tion discusses cases of nominals which do stray from the patterns noted above, 
particularly by either overextending the usage of definite articles for expressing 
uniqueness or completely barring modification by definite articles. 


5 Internal variation in definiteness marking 


There are at least two other classes of nominal which do not display a pattern 
akin to that of the regular nominals described before. These other classes of nom- 
inal are small when compared to regular nominals which display the overt corre- 
spondence between encoding strategy and notion of definiteness. The irregular 
nominals require the presence of an overt definite article for both uniqueness 
and familiarity interpretations. The complex nominals do not occur with definite 
articles, perhaps because they seem to have one already morphologically built in, 
and so their bare forms serve for both uniqueness and familiarity interpretations. 
Table 3 summarizes the general correspondences between definiteness marking 
strategy and interpretation for all three classes. 


Table 3: Presence of overt definite article according to usage 


Noun type    Uniqueness    Familiarity 


Regular      ✗             ✓ 
Irregular    ✓             ✓ 
Complex      ✗             ✗ 


The differences displayed by these other classes of nominal are shown by 
comparison with regular nominals in their distinct grammatical behavior with 
respect to some of Hawkins's (1978) usages of definite articles. These nominals 
display differences in the obligatoriness of the absence or presence of definite 
articles while undergoing immediate situation uses, larger situation uses, and 
anaphoric uses. Therefore, they reflect distinct styles of encoding either 
uniqueness or familiarity. There are only a handful of examples of these classes of 
nominal that this paper is able to provide, and the exact size of each class is yet to be 
determined. 
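

The correspondences summarized in Table 3 can be restated as a small lookup table. The following is a minimal sketch for illustration only, assuming the three-way classification described in the prose above; the names `NounClass`, `Use`, and `takes_overt_article` are hypothetical labels introduced here, and the sketch deliberately ignores the gradient preferences and exceptional environments discussed elsewhere in the paper.

```python
from enum import Enum


class NounClass(Enum):
    REGULAR = "regular"
    IRREGULAR = "irregular"
    COMPLEX = "complex"


class Use(Enum):
    UNIQUENESS = "uniqueness"
    FAMILIARITY = "familiarity"


# True = an overt definite article is expected; False = the bare form is used.
# Values restate the generalizations in the prose (illustrative assumption).
ARTICLE_EXPECTED = {
    (NounClass.REGULAR, Use.UNIQUENESS): False,
    (NounClass.REGULAR, Use.FAMILIARITY): True,
    (NounClass.IRREGULAR, Use.UNIQUENESS): True,
    (NounClass.IRREGULAR, Use.FAMILIARITY): True,
    (NounClass.COMPLEX, Use.UNIQUENESS): False,
    (NounClass.COMPLEX, Use.FAMILIARITY): False,
}


def takes_overt_article(noun_class: NounClass, use: Use) -> bool:
    """Look up whether a definite description of this class and use is overtly marked."""
    return ARTICLE_EXPECTED[(noun_class, use)]


for noun_class in NounClass:
    row = {use.value: takes_overt_article(noun_class, use) for use in Use}
    print(noun_class.value, row)
```

Only the regular class distinguishes the two notions of definiteness in form; the irregular and complex classes each collapse them, in opposite directions.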


5.1 Irregular nominals and definiteness encoding 


While regular nominals display a predictable pattern of associating distinct no- 
tions of definiteness with distinct definiteness marking strategies, the distinction 
is not recognized in the morpho-syntax of irregular nominals. Irregular nomi- 
nals are so called because they differ from regular nominals in not exhibiting 
bare forms as definite descriptions, or rather, they do not feature bare forms as 
uniqueness definites as with regular nominals. While the strategy for the encod- 
ing of anaphoricity remains identical among these two classes of nominals, irreg- 
ular nominals extend the use of definite articles to also encode uniqueness. For 
example, the word yivi ‘people’ does not permit bare forms to serve as definite 
descriptions where other nominals can. The sentences below present a context 
where a man named Juan is visiting a village and is surprised by the disappear- 
ance of its inhabitants. In this case, yivi cannot take a bare form and must take a 
definite article in order for the sentence to be acceptable. 


(73) [tyà Juáàn] isya'àn ra [nuux kaa] 
the.SG.M Juan go.COMPL 3SG.M village over.there 
‘Juan went to the village over there.’ 
a. suu koó ni indanii rā [*(nà) yivi] 
but NEG even find.COMPL 3SG.M the.HUM people 
‘But he did not find the people.’ 


A different result is reached if the irregular nominal is switched out for a 
regular nominal such as veñü'ü ‘church’. In a typical Mixtec village, there is one 
church dedicated to the local patron saint, whom the village also tends to be 
named after. In this case, the nominal takes a bare form because there is no 
previous mention of a church to serve as an antecedent for the definite article. 


(74) [tya Juáàn] isya'àn ra [nuux kaa] 
the.SG.M Juan go.COMPL 3SG.M village over.there 
‘Juan went to the village over there.’ 
a. suu koó ni indanii ra [(#ñà) veñü'ü] 
but NEG even find.COMPL 3SG.M the.INA church 
‘But he did not find the/a church.’ 


The inventory of irregular nominals is not very large at all, and they all seem 
to have an interesting semantic similarity. Table 4 provides a list of the irregular 
nominals I have been able to recognize so far. 


Table 4: Irregular nominals 


Cuevas Mixtec English 
yivi people 
tyaa man 
na’a woman 


Notice that each nominal is a human predicate, such that the selection of pred- 
icates represented among the irregular nominals seems to be indicative of an 
animacy hierarchy. It would appear that the most animate predicates, human 
predicates in this case, form a special class that exploits the definite article for 
further uses beyond what is typical within the language. The influence of ani- 
macy hierarchies in grammar has been well documented (Dahl & Fraurud 1996), 
and there are clear examples of its interaction with definiteness in languages as 
widely spoken as Spanish. For Cuevas Mixtec, there seems to be a particular 
relationship between animacy and uniqueness, which has been grammaticized 
in a way that treats unique members of the highest rank in animacy as if 
they were familiar. 

It is important to note that, despite the seeming obligatoriness of the definite 
article in the presence of irregular nominals, the definite article is only oblig- 
atory for the expression of definiteness. These same nominals may occur with 
some other types of determiners without the definite article, such as with cer- 
tain quantifiers. Therefore, cases of definite articles on irregular nominals are 
not cases of prefixes. 


(75) ✓ [ta’an iin tyaa tyà kümí iin bárrá] kan ra ri 
every one man the.SG.M have.IPFV one donkey hit.IPFV 3SG.M 3SG.AML 


‘Every man that has a donkey hits it.’ 


There are even some environments where the irregular nominal may shed 
any prenominal material. The only environment where I have noticed this is 
that of the preverbal position while the nominal also takes a relative clause. The 
example below shows that the same nominal has an optional definite article in 
the preverbal position, but an obligatory article in the postverbal position. Both 
the preverbal position and the relative clause seem to be important for optionality 
of the definite article, and it is mysterious why this should be the case. 


(76) a. [(tya) tyaa tyà kátóó ka'vi] kuá'à và'à ka’vi ra 
the.SG.M man the.SG.M like.IPFV read.IRR much good read.IPFV 3SG.M 


‘The man who likes reading reads a lot.’ 


b. (Eu kua’a va’a ika'vi [*(tya) tyaa tyà kátóó 
yesterday much good read.COMPL the.SG.M man the.SG.M like.IPFV 
ka’vi] 
read.IRR 


‘Yesterday, the man who likes reading read a lot.’ 


Generally, however, irregular nominals must take on definite articles if they do 
not co-occur with numerals or when they occur in preverbal position. They must 
even take on definite articles if they are modified by quantifiers. Many nominals 
are able to occur in partitive constructions without any modifying material be- 
sides the quantifier. In such constructions, nominals actually tend to have generic 
interpretations, at least if the modified nominal occurs in preverbal position. 


(77) ✓ [sava tyina] ndé'i 
half dog cry.IPFV 
‘Half of dogs bark.’ 


(78) ✓ [sava tyikuii] isyi'i [tyà Juaan] 
half water drink.COMPL the.SG.M Juan 
‘Juan drank half the water.’ 


Unlike regular nominals, irregular nominals are incapable of occurring 
without the definite article in the same environment. 


(79) [sava *(nà) tyaa] kumi nda’a 
half the.HUM man have.IPFV hand 
(‘Half of men have hands.’) 


It may be worth noting other cases of definite articles occurring on unique- 
ness definites beyond the class of irregular nominals, so as to demonstrate the 
semantic complexity of interactions between definite articles and nominals more 
broadly. There are some very special cases of definite articles occurring on reg- 
ular nominals in order to make their referents more precise. The example below 
presents a case where a definite article is used to specify a member of a pair of 
unique individuals, rather than encode familiarity. The word märtöön ‘adminis- 
trator’ has only two possible referents in the village festival context, the male and 
female administrators. The occurrence of definite articles allows for precision as 
to which of them is being referred to. As a regular nominal, the word märtöön 
has the capacity to occur in a bare form as a uniqueness definite, but it also takes 
on the definite article only to make precise the gender of the referent. 


(80) isya'àn i mun  ndütütü [nà ndyáá chüün] 
go.COMPL 1SG where meet.IPFV the.HUM mind.IPFV work
‘I went to where the festival organizers were meeting.’
a. tyà köö martóón ni  isyoo 
and NEG administrator even there.be.COMPL
‘But the administrators were not there.’
b. tya koó [tyà märtööon] ni  isyoo 
and NEG the.SG.M administrator even there.be.COMPL
‘But the male administrator was not there.’


The example shows that definite articles serve many purposes beyond the for- 
mation of definite descriptions in the language. Since they also encode gender, 
it seems possible for some regions of the grammar of Cuevas Mixtec to exploit 
this aspect of their meaning while ignoring their tendency to also encode famil- 
iarity. Altogether, the data present a picture of overt definiteness marking in this 
language that complicates the narrow pattern observed for regular nominals. 


5.2 Complex nominals and definiteness encoding 


The last category of nominals to be discussed here are what will be called complex 
nominals. Complex nominals differ from both previously mentioned classes of 
nominal in that they are barred from taking on definite articles. This may be
due to the fact that, as compounds, they already feature a sort of built-in noun 
classifier. The only case of a complex nominal that this paper discusses is that of 
tyaxini ‘mayor’, as in the example below, which shows the unacceptability of a 
definite article. The word is a compound of a noun classifier tyà and the word xini
‘head’, such that they are inseparable in order to retain the meaning of ‘mayor’. If 
the complex nominal is switched out for a regular nominal like mäeströ ‘teacher’ 
in the same example, the option of attaching a definite article becomes available. 


(81) Context: There is a competition between the mayors of each village,
followed by a separate competition between the teachers. 
a. [(*tyà) tyaxini tyà isakana’a] kuani’u ra
the.SG.M mayor the.SG.M win.COMPL go.home.IPFV 3SG.M
‘The mayor that won went home.’
b. [(tya) mäeströ tya isákaná'à] kuanü’ü ra
the.SG.M teacher the.SG.M win.COMPL go.home.IPFV 3SG.M
‘The teacher that won went home.’


As a nominal, the word may be modified with numerals and indefinite articles
despite the noun classifier. Even with the constraint against the occurrence of 
definite articles, complex nominals may still occur as familiarity definites. In the 
second sentence below, the nominal tyaxini ‘mayor’ has the interpretation of 
referring to the same mayor that was previously mentioned. 


(82) [tyà juáàn] indatü'ün rà syíín [mnn tyaxini]
the.SG.M Juan chat.COMPL 3SG.M with one mayor
‘Juan chatted with a mayor...’
a. tyà tyaxini ikusii int rā 
and mayor cheerful.COMPL inside 3SG.M
‘...and the mayor was happy.’


The rejection of definite articles for these items may have an explanation in the 
occurrence of a derivationally built-in noun classifier tyà. Compounding with 
classifiers occurs quite commonly across Mixtec and other Otomanguean lan- 
guages, though it is better understood as a diachronic phenomenon which has 
resulted in fossilized forms of classifiers (Macri 1983). Classifiers in compounds, 
or so-called lexical classifiers, are distinct from the grammatically active noun 
classifiers for all Mixtec languages. They constitute a much larger inventory with 
meaning contributions that have been lost over time, and not all words of the lan-
guage feature them. Lexical classifiers may co-occur with definite articles, sub- 
stantiating the claim that they are a distinct class of fossilized forms. 


(83) tyí tyi-xá'ü 
the.AML CLF-money 


‘the goat’


In addition, the noun classifier is actually interchangeable with other noun 
classifiers, in particular the plural human classifier na. This allows the nominal 
to take on a plural number meaning in what seems to be the only case of nominal 
inflection in this language. Likewise, this item is still unable to co-occur with a 
definite article. 


(84) Context: There is a gathering of villages. 
[sava (*nà) nàxini] ku’vi na 
half the.HUM mayors sick.IPFV 3.HUM
‘Half the mayors were sick.’


Therefore, the classifier that occurs in the complex nominal is not quite the 
same as the lexical classifiers that have been more widely described for Mixtec 
languages. 

Beyond etymological considerations, the rejection of definite articles could 
also be explained from a semantic point of view. Mayors are of course relational 
nouns, or designations dependent on an individual’s relationship with something 
else. For someone or something to be a mayor, there must be a town for that indi- 
vidual to be a mayor of, perhaps automatically inducing a bridging environment 
with a part-whole relationship. Further investigation on other relational nouns 
would be necessary to substantiate this. It does seem to be the case that true relational
nouns such as body parts also reject modification by definite articles. However,
body parts differ from tyaxini in that they seem to be averse to occurring
as bare nominals and require at least some other form of modification.


(85) Context: A teacher is overseeing a boy make a drawing of a man. The 
teacher takes a look at the boy’s progress, and notices that the head of 
the man is drawn disproportionately large. 


kon ndyá'a (*ñà) xini #(ra) 
big very the.INA head 3SG.M
‘The head is too big.’
The complex nominal therefore presents another challenge to the development
of a homogeneous account of definiteness in Cuevas Mixtec. There seems to be
an active grammatical role that the noun classifier plays in the construction of 
a relational noun such as ‘mayor’ while avoiding the typical usage of noun clas- 
sifiers as encoding familiarity as definite articles. The data in this section also 
presented the case of obligatory definite articles on irregular nominals for cases 
where regular nominals would be bare, demonstrating the apparent influence 
of an animacy hierarchy on the distribution of definite articles. The existence of 
both classes of nominal complicates an account of definiteness encoding strategy 
as uniformly corresponding to the expression of either uniqueness or familiarity 
for Cuevas Mixtec. 


6 Conclusion 


This paper has presented the internal variation that Cuevas Mixtec exhibits
in its strategies of definiteness marking, and has considered what that
variation may result from. The data support the findings of Schwarz (2009;
2013) and Jenks (2015) that languages which feature distinct strategies for def- 
initeness marking will often associate those strategies with distinct notions of 
definiteness. One strategy will correspond to the expression of uniqueness, or 
the function of referring to an individual that uniquely fulfills the description 
provided by the noun. Another strategy will correspond to (strong) familiarity, 
or the function of creating an anaphor to previous linguistic expression in a dis- 
course. Schwarz (2013) and Jenks (2015) found that in many languages which fea- 
ture bare nominal definite descriptions in addition to overt definiteness marking, 
bare definite nominals will be interpreted as unique, while familiarity requires 
the overt marking. The pattern is replicated in Cuevas Mixtec, which has bare 
nominals serve as uniqueness definites in many contexts, and requires the occur- 
rence of overt definite articles for the expression of familiarity. This was shown 
by observing the grammatical constraints on definite descriptions within differ- 
ent semantic environments listed by Hawkins (1978). Bare definites are preferred 
in cases of larger situation and immediate situation uses of definite descriptions, 
environments which reinforce the uniqueness of the definite description's ref- 
erent. Nominals with overt definite articles were preferred in cases where the 
definite description was used as an anaphor, corresponding to the familiarity 
characterization of definite descriptions in the literature. Even the case of bridg- 
ing demonstrated the predicted correspondences between relationship type and 
preferred strategy of definiteness marking. Where the relationship between the 
definite description and its antecedent was a part-whole relationship, the bare
form seemed to be preferred. Where the relationship between the definite de- 
scription and its antecedent was a producer-product relationship, modification 
by overt definite articles seemed to be preferred. 

However, the pattern described above holds only for a large subset of
the nominal inventory of Cuevas Mixtec. There are smaller classes of nominal
which either lack the capacity to occur in bare forms in most contexts, or lack
the capacity to take on definite articles. Those nominals that cannot shed the definite
article were called irregular nominals, and they appear to retain the capacity
to express uniqueness despite the presence of the article. Irregular nominals were
shown to retain the definite article in immediate situation uses of definite descriptions,
and to be unable to shed it even in the presence of other modifiers such as
quantifiers. The definite article was shown not to be a prefix because there are environments
where it may disappear, such as when a numeral occurs in its place.
All of the irregular nominals seem to be predicates of humanity of some sort, 
meaning ‘man’, ‘woman’, or ‘people’. The data therefore suggest an interaction 
between overt definiteness marking, especially uniqueness marking, and an ani- 
macy hierarchy. Irregular nominals contrast with complex nominals, which seem 
not to take definite articles at all. Complex nominals included the relational noun
‘mayor’, which is more frequently used as a uniqueness definite. Examples
of complex nominals are difficult to encounter, so a much more thorough
study of this class is necessary to determine all the semantic properties involved 
in the inventory. Ultimately, the data show that if we are to assume an account 
of the semantics of definiteness along the lines of Schwarz and Jenks, there must 
also be some account of how the nominal itself contributes to determining
definiteness marking preferences.
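
The generalizations summarized in this conclusion can be restated as a small lookup procedure. The following is only an illustrative sketch in Python, not part of the analysis itself: the names `NominalClass`, `Use`, and `preferred_marking` are hypothetical, and the treatment of numerals is a simplifying assumption based on the observation that the article of an irregular nominal may disappear when a numeral occurs in its place.

```python
from enum import Enum, auto


class NominalClass(Enum):
    REGULAR = auto()    # the large default class of nominals
    IRREGULAR = auto()  # 'man', 'woman', 'people': cannot shed the article
    COMPLEX = auto()    # e.g. 'mayor': cannot take the article


class Use(Enum):
    LARGER_SITUATION = auto()
    IMMEDIATE_SITUATION = auto()
    PART_WHOLE_BRIDGING = auto()
    ANAPHORIC = auto()
    PRODUCER_PRODUCT_BRIDGING = auto()


# Uses that the text associates with uniqueness rather than familiarity.
UNIQUENESS_USES = {
    Use.LARGER_SITUATION,
    Use.IMMEDIATE_SITUATION,
    Use.PART_WHOLE_BRIDGING,
}


def preferred_marking(nominal: NominalClass, use: Use, has_numeral: bool = False) -> str:
    """Return 'bare' or 'article' for a definite description (simplified)."""
    if nominal is NominalClass.COMPLEX:
        # Complex nominals reject the definite article in every use.
        return "bare"
    if nominal is NominalClass.IRREGULAR:
        # Assumption: a numeral is the only environment modelled here
        # in which an irregular nominal appears without its article.
        return "bare" if has_numeral else "article"
    # Regular nominals: bare for uniqueness uses, article for familiarity uses.
    return "bare" if use in UNIQUENESS_USES else "article"
```

For instance, `preferred_marking(NominalClass.REGULAR, Use.ANAPHORIC)` yields `"article"`, while the same use with a complex nominal yields `"bare"`. The sketch deliberately ignores word order and quantifier effects, which the data show to be relevant.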
HUM   plural human      SPEC  specifier
INA   inanimate         STR   plant/structure
LIQ   liquid            YTH   youth
Strong vs. weak definites: Evidence from
Lithuanian adjectives
While Lithuanian (a Baltic language) lacks definite articles, it can use an adjectival
system to encode definiteness. Adjectives can appear in a bare short form as in
graži ‘beautiful.NOM.F.SG’ and a long form with the definite morpheme -ji(s) as in
gražio-ji ‘beautiful.NOM.F.SG-DEF’. In this paper, I explore definiteness properties of
Lithuanian nominals with long and short form adjectives. Recent cross-linguistic 
work identifies two kinds of definites: strong definites based on familiarity and 
weak definites licensed by uniqueness (Schwarz 2009; 2013; Arkoh & Matthewson 
2013; Jenks 2015; i.a.). Following this line of work, I argue that short form adjec- 
tives, in addition to being indefinite, are also compatible with situations licensed 
by uniqueness, and in this way resemble weak article definites. Long form adjec- 
tives pattern with strong article definites, as evidenced by familiar definite uses and 
certain bridging contexts parallel to the German data (Schwarz 2009). This study 
provides novel evidence for the distinction between strong versus weak definites 
showing that this distinction is not necessarily reflected in determiner patterns, 
but it can also be detected in the adjectival system. 


1 Introduction 


There is a tradition in the literature to define definiteness either in terms of 
uniqueness (Russell 1905; Strawson 1950; Frege 1892) or in terms of anaphoric- 
ity (familiarity) (Christophersen 1939; Kamp 1981; Heim 1982). Nevertheless, a 
detailed study of German articles by Schwarz (2009) demonstrates that both fa- 
miliarity and uniqueness are necessary tools to capture definite uses. Specifically, 
Schwarz provides empirical evidence showing that there are two semantically 
distinct definites in German: a strong article definite licensed by familiarity and 
a weak definite licensed by uniqueness. The distinction between the two articles
is visible not only in anaphoric and uniqueness-based contexts, but also in bridg- 
ing contexts where a part-whole relation is licensed by the weak definite article, 
and the product-producer context is compatible with the strong definite article. 
The dichotomy of strong and weak definites has been supported by a number of 
other studies from different languages including: Akan (Arkoh & Matthewson 
2013), ASL (Irani 2019 [this volume]), Austro-Bavarian (Simonenko 2014), and 
Icelandic (Ingason 2016). 

This paper is the first attempt to bring into the discussion of strong versus 
weak definites articleless languages like Lithuanian, which uses the adjectival 
system as one of the means to express definiteness. While Lithuanian lacks def- 
inite articles, it has the suffix -ji(s) associated with definiteness (Ambrazas et al. 
1997). This definite morpheme appears on a variety of non-NP categories, but 
for present purposes I focus on adjectives. Adjectives can appear in a bare short 
form as in (1a) and a long form with a definite morpheme -ji(s) as in (1b). Gillon &
Armoskaite (2015) report that the nominals with short adjectives can be definite 
or indefinite depending on the context, while nominals with long adjectives are 
necessarily interpreted as definites, as reflected in the glosses in (1). 


(1) a. graž-i mergin-a
beautiful-NOM.F.SG girl-NOM.F.SG
‘a/the beautiful girl’
b. graž-io-ji mergin-a
beautiful-NOM.F.SG-DEF girl-NOM.F.SG
‘the beautiful girl’


In this study, I provide novel evidence for the distinction between strong ver- 
sus weak article definites (Schwarz 2009) by exploring definiteness properties of 
Lithuanian nominals with short and long adjectives. In particular, I demonstrate 
that long form adjectives function like familiar definites, and are equivalent to 
the German strong article, as they emerge in anaphoric expressions that refer 
back to linguistic antecedents (2). This reference otherwise is not possible with 
short form adjectives. The long forms pattern with the strong article in German 
not only in standard anaphoric cases, but also in product-producer bridging contexts,
as will be illustrated in §4.


(2) Marija pristatė mane savo pusbroliui iš Vilniaus. Gražus-is /
Marija introduced me self cousin from Vilnius beautiful-DEF /
*gražus pusbrolis galantiškai nusilenkė ir pabučiavo man į ranką.
beautiful cousin gallantly bowed and kissed me to hand
‘Marija introduced me to her cousin from Vilnius. The beautiful cousin
gallantly bowed and kissed my hand.’


While the nominals with short form adjectives can indeed function like indef- 
inites by introducing a new discourse referent, I provide new data showing that 
they can also occur in situations licensed by uniqueness as evidenced by larger 
situations based on general world knowledge, e.g. generic rules as in (3). This 
observation suggests that short adjectives pattern in a similar way to the weak 
definite that is associated with uniqueness. The similarity of short adjectives with 
weak definites is further supported by the felicity of short forms in part-whole 
bridging contexts, which in German also require the weak article (see §4).


(3) Praėjus dviem savaitėm po rinkimų, prezidentas turi teisę atleisti naują
passed two weeks after elections president has right fire new
/ #naują-jį ministrą pirmininką tik išskirtiniais atvejais.
/ new-DEF minister prime only exceptional cases
‘Two weeks after the election, the president has a right to fire the new
prime minister only in exceptional cases.’


Nevertheless, a difference between Lithuanian and German occurs in larger sit- 
uations that include specific unique individuals. German permits only the weak 
article in such a context, whereas Lithuanian uses the long form adjective as in 
(4). A similar type of distinction is also observed by Jenks (2015) between bare 
nouns versus definite demonstratives and pronouns in Thai. 


(4) Po rinkimų naujas-is / #naujas prezidentas paskambino miestelio
after elections new-DEF / new president called city
merui.
mayor
‘After the elections, the new president called the city mayor.’


Overall, the Lithuanian data provide additional support for Schwarz's (2009) 
proposal that definiteness is a two-fold phenomenon consisting of uniqueness 
and anaphoricity that can be expressed by two separate forms/articles in a language.
The adjective-based definite expressions presented here broaden the typological
landscape on how languages encode the strong vs. weak article distinction
by demonstrating that this distinction is not necessarily reflected in determiner 
patterns, but it can also be detected in the adjectival system. The Lithuanian data 
included in this paper have been tested with 7 informants who worked with the 
author, who is also a native speaker of Lithuanian. In addition to that, an online 
survey with 20 additional native speakers has been carried out. This was a ques- 
tionnaire study on Google Forms where the speakers had to read a sentence and 
select an appropriate adjective that sounded the most felicitous in a given con- 
text. While a number of instances show a very clear semantic contrast between 




Milena Šereikaitė


long and short adjectives, the results from other examples exhibit a certain de- 
gree of variation. Particularly, this arises in the contexts that are compatible with 
both familiar and uniqueness uses. Indeed, Schwarz (2019 [this volume]) notes 
that there exist contexts where the strong versus weak distinction can be blurry and
languages show some variation with respect to which definite form is used. I 
will review the variation patterns exhibited by the data and discuss what conse- 
quences they have for the theory. 

This paper is structured as follows. In §2, the main typological facts of nominals
with short and long adjectives will be presented. In §3, I review different
approaches that have been used to capture definite uses with a particular focus
on Schwarz's (2009) proposal and studies supporting it. §4 compares the definite
use of short and long adjectives with strong and weak articles in German,
illustrating the parallels between the two languages. It is demonstrated that the
long form enforces familiarity just like the strong article does in German, and the
short form is compatible with uniqueness in a similar way to the weak article in
German. §5 concludes.


2 Typological background 


This section describes the basic patterns of the way Lithuanian marks definite- 
ness in relation to other languages. Lithuanian lacks (in)definite articles, and 
thereby a bare noun is ambiguous between definite and indefinite readings as 
in (5). Article-less languages, like, for example, most Slavic languages, have been 
argued to have a DP layer with an empty D category (Rappaport 1998; Leko 1999; 
Pereltsvaig 2007; i.a.). However, this proposal has been challenged by a number 
of researchers (Bošković 2009; 2012; Bošković & Gajewski 2011; Despić 2011; i.a.) 
claiming that nominals in these languages are simply NPs. The recent work on 
Lithuanian indicates that even though no overt article is present within a nomi- 
nal, at least definite expressions are always DPs (Gillon & Armoskaite 2015). 


(5) mergin-a
girl-NOM.F.SG 
‘a/the girl’ 


Nevertheless, Lithuanian has some morphological means to mark definiteness, 
namely the suffix -ji(s). I will call this suffix a definite form. The definite form 
cannot be attached to nouns as shown in (6). 
(6) * mergin-a-ji
girl-NOM.F.SG-DEF 
Int. ‘the girl’ 


The suffix -ji(s) occurs with non-NP categories, e.g. adjectives; recall the minimal
pair from (1), repeated here in (7).¹ The traditional Lithuanian Grammar
(Ambrazas et al. 1997: 142) defines the short form as indefinite, “unmarked”, and 
the long form as definite, “marked”. Gillon & Armoskaite (2015) show that both 
forms can in fact be definite. 


(7) a. graž-i mergin-a
beautiful-NOM.F.SG girl-NOM.F.SG
‘a/the beautiful girl’
b. graž-io-ji mergin-a
beautiful-NOM.F.SG-DEF girl-NOM.F.SG
‘the beautiful girl’


Lithuanian, at least typologically, is different from some Slavic languages that 
have a definite suffix. For example, Bulgarian, unlike Lithuanian, has an option 
to attach the definite suffix -ta to a noun (8a) as well as to an adjective (8b). 


(8) Bulgarian 
a. kniga-ta 


book-DEF 
‘the book’ 


b. xubava-ta kniga 
nice-DEF book 
‘the nice book’ 


The Lithuanian short vs. long adjective pairs are cognate with short and long 
adjective forms found in Serbo-Croatian (see Aljović 2010 and references therein)
and Old Church Slavonic (Šereikaitė 2015). The definite suffix -ji(s) is originally
a pronominal form (Ulvydas 1965; Stolz 2008) where ‘jis’ stands for ‘he’ and ‘ji’


¹Other categories that can take the definite form are: pronouns like mana ‘mine’ vs. mano-ji
‘mine-DEF’, demonstratives ta ‘that’ vs. to-ji ‘that-DEF’, relative pronouns kuri ‘who/which’ vs.
kurio-ji ‘who/which-DEF’, etc. For a full list see Stolz (2008: 223-224).

²The definite form -ji(s) is subject to elision. The glide j is omitted before the sibilant consonant
/s/ as in e.g. graž-us ‘beautiful-NOM.SG.M’ + jis = gražus-is ‘the beautiful’.
stands for ‘she’.³ Both short and long adjectives agree with the noun as indicated
in (7). The definite form -ji(s) also shows agreement in number, gender and case
with the noun as illustrated in Table 1 for both singular and plural masculine
forms. However, for the reader's convenience and for reasons of space, I gloss
-ji(s) as DEF.


Table 1: Inflectional paradigm of short and long adjectives of jaunas
‘young’ (adapted from Stolz 2008)

        jaun-as      jaunas-is      jaun-i       jaunie-ji
        (M.SG)       (M.SG-DEF)     (M.PL)       (M.PL-DEF)
NOM     jaun-as      jaun-as-is     jaun-i       jaun-ie-j-i
GEN     jaun-o       jaun-o-j-o     jaun-ų       jaun-ų-j-ų
DAT     jaun-am      jaun-a-j-am    jaun-iems    jaun-ies-iems
ACC     jaun-ą       jaun-ą-j-į     jaun-us      jaun-uos-i-us
INST    jaun-u       jaun-uo-j-u    jaun-ais     jaun-ais-i-ais
LOC     jaun-ame     jaun-a-j-ame   jaun-uose    jaun-uos-i-uose


In this paper, I will be looking at the instances with a single adjective, be it a 
short form or a long form. For completeness, observe that the occurrence of two 
long adjectives with a definite meaning is judged as odd, at least in default cases (9b).⁴


(9) a. graž-us sen-as lok-ys
beautiful-NOM.M.SG old-NOM.M.SG bear-NOM.M.SG
‘a beautiful old bear’
b. ?? gražus-is senas-is lokys
beautiful-NOM.M.SG-DEF old-NOM.M.SG-DEF bear-NOM.M.SG
‘the beautiful old bear’


³There are several theories about the origin of the definite form -ji(s). Stolz (2008) argues that
the definite marker used to function as a relative pronoun in preliterate times, while Rosinas
(1988) suggests that this definite marker is a “postposed deictic pronoun”. In Valeckienė (1986),
definite forms are treated as apposition constructions where the definite form is the apposition
proper.

⁴Note that in formal written contexts or contexts that require emphasis/exaggeration the occurrence
of two long forms is acceptable. Not only the discourse plays a role, but also prosody. The
example in (9b) is judged as grammatical when there is a pause between the two adjectives.
I thank Solveiga Armoskaite (personal communication) for bringing this to my attention.
Thus, Lithuanian, at least in standard, discourse-neutral cases, does not permit
multiple definite forms in the context of a definite noun phrase,⁵ unlike, for
example, Greek (see Alexiadou (2014) and references therein), which is known for
multiple marking of definiteness (10).


(10) Greek (Alexiadou 2014: 19) 
to vivlio to kokino to megalo 
the book the red the big 
‘the big red book’ 


The definite suffix can also be used to refer to kinds (Rutkowski & Progovac 
2006). The short adjective simply denotes a bear that happens to be white as in 
(11a). In contrast, the long adjective is ambiguous between the definite reading 
and the kind reading expressing a certain species of bears, namely the polar bear 
Ursus maritimus, as in (11b). 


⁵Nevertheless, Stolz (2008) gives the example in (i.a) and claims that two definite adjectives
can in fact occur together. Note that this instance includes coordination. It might be that the 
first adjective has been accompanied by a noun which then has been elided. Observe that the 
example becomes ungrammatical in default cases without the conjunct (i.b). 


(i) a. Trūksta greta nuostabių-jų ir gražių-jų
lack.PRS.3 near wonderful.GEN.F.PL-DEF and beautiful.GEN.F.PL-DEF
atstovių
representatives.GEN.F.PL
‘The wonderful and beautiful representatives are missing.’ (adapted from Stolz
2008: 226)
b. * Trūksta greta nuostabių-jų gražių-jų
lack.PRS.3 near wonderful.GEN.F.PL-DEF beautiful.GEN.F.PL-DEF
atstovių
representatives.GEN.F.PL
‘The wonderful and beautiful representatives are missing.’ (adapted from Stolz
2008: 226)


⁶An anonymous reviewer asks how nominals without modifiers express kinds in Lithuanian
in general. Bare nominals can be kind-denoting. However, their use is restricted. Bare plural 
nominals are compatible with kind-denoting predicates like extinct, whereas bare singulars are 
not as exemplified below. 


(i) a. Tigrai greitai išnyks.
Tigers.NOM.M.PL soon extinct.FUT.3
‘Tigers will go extinct soon.’
b. # Tigras greitai išnyks.
Tiger.NOM.M.SG soon extinct.FUT.3
Int. ‘The tiger will go extinct soon.’
(11) a. balt-as lok-ys
white-NOM.M.SG bear-NOM.M.SG
‘a/the white bear’
b. balt-as-is lok-ys
white-NOM.M.SG-DEF bear-NOM.M.SG
(i) ‘the white bear’ ✓ definite reading
(ii) ‘the polar bear’ ✓ kind reading


Interestingly, a long adjective with a definite meaning and a long adjective 
with a kind interpretation can be stacked together (12). Observe that the definite 
meaning of ‘white’ in default cases is disfavored. Šereikaitė (2017) argues that in
Lithuanian a combination of a kind-level adjective and a noun syntactically is 
similar to a phrasal compound, whereas a definite adjective and a nominal do 
not function like a single syntactic unit. Instead, the definite adjective behaves 
like a modifier of a nominal. 


(12) graž-us-is balt-as-is lok-ys
beautiful-NOM.M.SG-DEF white-NOM.M.SG-DEF bear-NOM.M.SG
(i) ‘the beautiful polar bear’ 
(ii) ?? ‘the beautiful white bear’ 


Having presented the main typological facts on nominals with adjectives, I 
now turn to the theoretical discussion on two types of definites. 


3 Two types of definites 


This section describes different approaches that have been used to define defi- 
niteness. There has been extensive debate in the literature whether definiteness 
should be characterized by uniqueness or by familiarity. On the one hand, def- 
inite articles in expressions like the moon in (13) are argued to be licensed by 
uniqueness and no prior mention of the referent is necessary (Russell 1905; Straw- 
son 1950; Frege 1892). The earlier versions of this approach, e.g. Strawson’s (1950) 
work, that assume “absolute” uniqueness are problematic for instances that involve
situational uniqueness. As mentioned by Schwarz (2013), there are a number
of situations where the descriptive content of the definite expression holds true 
for more than one entity in the world. For example, the definite description the 
projector is used in (14), even though there is more than one projector existing in 
the world. 
(13) Armstrong was the first man to walk on the moon.


(14) Context: Said in a lecture hall containing exactly one projector. 
The projector is not being used today. (Schwarz 2013: 537) 


On the other hand, definite articles can be viewed as expressing anaphoricity, 
also often referred to as familiarity (Christophersen 1939; Kamp 1981; Heim 1982). 
Under this approach, definite nominals are anaphoric and need to be linked to a 
previously mentioned discourse referent. This is the so-called strong familiarity 
in Roberts's (2003) terms. While this anaphoricity-based analysis captures some 
of the uses of definite articles, it is still unclear how such an approach would 
account for cases such as (15) that lack a prior mention of the definite description and
instead include global familiarity. 


(15) John bought a book and a magazine. The book was expensive. (Schwarz 
2013: 537) 


Several attempts have been made to propose a mixed view of both approaches 
that would use both uniqueness and familiarity to license definites (Kadmon 1990; 
Farkas 2002; Roberts 2003). The hybrid view of definiteness requires different 
analyses for different uses of definites, and is thus conceptually a somewhat less
desirable outcome. Nevertheless, this approach has been empirically supported
by recent cross-linguistic work suggesting that neither the purely uniqueness- 
based approach nor the anaphoricity-based analysis can fully account for the 
full paradigm of definite uses. 

One of the main empirical studies that supports the hybrid approach comes 
from Schwarz (2009; 2013). Schwarz shows that German has two types of defi- 
nite articles that correspond to two semantically distinct definites. The weak defi- 
nite contracts with a preposition in certain environments and the strong definite 
does not. Schwarz demonstrates that the weak definite is licensed by uniqueness 
and the strong definite is licensed by familiarity.⁷ Example (16) involves a globally unique
situation, and the contracted form zum, namely the weak definite, is felicitous. 
On the other hand, the non-contracted form in dem, thus the strong definite, is 
used with nominals that are anaphoric with preceding expressions as in (17). The 
strong vs. weak distinction has been shown to hold true in other environments 
that involve either unique definites or familiar definites e.g., different cases of 
bridging, larger situations or immediate situations (see §4 for some examples of
these uses). 


⁷I gloss the weak article definite as D_weak and the strong article definite as D_strong.
(16) German (Schwarz 2009: 40)
Armstrong flog als erster zum / #zu dem Mond.
Armstrong flew as first one to-the_weak / to the_strong moon
‘Armstrong was the first one to fly to the moon.’


(17) German (Schwarz 2009: 30) 
In der New Yorker Bibliothek gibt es ein Buch über Topinambur. 
in the New York library exists EXPL a book about topinambur
Neulich war ich dort und habe #im / in dem Buch nach einer
recently was I there and have in-the_weak / in the_strong book for an
Antwort auf die Frage gesucht, ob man Topinambur grillen
answer to the question searched whether one topinambur grill
kann.
can

‘In the New York public library, there is a book about topinambur. 
Recently, I was there and looked in the book for an answer to the 
question of whether one can grill topinambur. 


To encode these uses of definites, Schwarz (2009; 2013) proposes the following
analysis. The denotation of the weak article introduces a unique referent in a
given situation, as in (18), thereby capturing situational uniqueness, which was
problematic for the early proponents of the uniqueness approach. The strong
article, defined in (19), not only has a unique referent but also takes an
additional argument that is identical to a previously introduced individual
within a certain situation/context. The two articles are thus related: the strong
article is a combination of the weak article plus the anaphoric link.


(18) $[\![D_{\text{weak}}]\!] = \lambda s_r.\lambda P.\iota x.P(x)(s_r)$ (Schwarz 2009: 264)


(19) $[\![D_{\text{strong}}]\!] = \lambda s_r.\lambda P.\lambda y.\iota x.P(x)(s_r) \wedge x = y$ (Schwarz 2009: 260)
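

The relation between the two denotations in (18) and (19) can be made concrete with a small executable sketch. This is only an illustration under simplifying assumptions, not Schwarz's formalism: situations are modelled as finite sets of entities, properties as functions of an entity and a situation, and presupposition failure of the iota operator as the value None. All names used (iota, d_weak, d_strong, is_kind, and the toy entities) are hypothetical.

```python
from typing import Callable, FrozenSet, Optional, Tuple

# Illustrative types (assumptions, not part of Schwarz's formalism).
Entity = Tuple[str, str]        # (identifier, kind), e.g. ("b1", "book")
Situation = FrozenSet[Entity]   # a situation is modelled as the set of entities in it
Property = Callable[[Entity, Situation], bool]


def iota(candidates) -> Optional[Entity]:
    """Return the unique candidate; None models presupposition failure."""
    found = list(candidates)
    return found[0] if len(found) == 1 else None


def d_weak(s_r: Situation):
    """(18): lambda s_r . lambda P . iota x . P(x)(s_r)"""
    def apply_property(P: Property) -> Optional[Entity]:
        return iota(x for x in s_r if P(x, s_r))
    return apply_property


def d_strong(s_r: Situation):
    """(19): lambda s_r . lambda P . lambda y . iota x . P(x)(s_r) and x = y"""
    def apply_property(P: Property):
        def apply_antecedent(y: Entity) -> Optional[Entity]:
            # Same as the weak article, plus the anaphoric link x = y.
            return iota(x for x in s_r if P(x, s_r) and x == y)
        return apply_antecedent
    return apply_property


def is_kind(kind: str) -> Property:
    """Build a toy property: x is in the situation and is of the given kind."""
    return lambda x, s: x in s and x[1] == kind


# A toy situation with one moon and two books.
moon = ("m", "moon")
book1, book2 = ("b1", "book"), ("b2", "book")
situation = frozenset({moon, book1, book2})

unique_moon = d_weak(situation)(is_kind("moon"))            # uniqueness is satisfied
ambiguous_book = d_weak(situation)(is_kind("book"))         # two books: uniqueness fails
anaphoric_book = d_strong(situation)(is_kind("book"))(book1)  # antecedent resolves it
```

In this toy model the weak article succeeds only when the situation contains exactly one entity satisfying the description, whereas the strong article can also succeed in a situation with several such entities, because the extra argument y ties the referent to a previously introduced individual.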


Schwarz's proposal that there are two semantically distinct articles in natural
language has been supported by recent work. Note that English does not show a
morphological distinction and uses the for both types of definites, as in (20).


(20) Amy bought a book about the_weak sun. The_strong book was expensive.
(Ingason 2016: 115)

However, a number of other languages employ different morphosyntactic means to
express different definite uses. For instance, Ingason (2016) argues that
Icelandic parallels German in having two distinct phonological exponents for two
semantically distinct definites. In general, the article in Icelandic is
expressed as a suffix attached to the noun in both anaphoric and uniqueness-based
contexts. Nevertheless, a morphological distinction between the two types of
definite uses emerges in the presence of evaluative adjectives. When an
evaluative adjective intervenes between the determiner and the noun, the free
article HI is used. Specifically, the free article functions as a unique definite
and corresponds to the weak article in German, as in (21). This article cannot be
used anaphorically; instead, the demonstrative is used in this type of
environment, as illustrated in (22). The demonstrative thus behaves like the
strong definite in German.


(21) Icelandic (Ingason 2016: 123)
Context: First mention of the World Wide Web.
Tim Berners Lee kynnti heiminn fyrir hinum / #þessum ótrúlega veraldarvef.
Tim Berners Lee introduced world.the to HI-the_weak / this_strong amazing world.wide.web
‘Tim B. Lee introduced the world to the amazing World Wide Web.’


(22) Icelandic (Ingason 2016: 133)
Hún fékk engin góð svör frá #hinum / þessum hræðilega stjórnmálamanni.
she got no good answers from HI-the_weak / this_strong terrible politician
‘She got no good answers from the terrible politician.’


In addition, Fering Frisian (Ebert 1971) and Austro-Bavarian (Simonenko 2014)
have also been reported to have two distinct morphological forms for the two
definites, in this respect resembling German and Icelandic.

Another important case worth mentioning comes from Akan (Kwa, Niger-Congo). Akan,
unlike German, has only one overt form used for one of the definites. According
to Arkoh & Matthewson (2013), the weak definite article is realized as zero, and
thus bare nominals are used in this context (23). Nevertheless, Akan employs an
overt form for anaphoric uses, namely the demonstrative no, as in (24),
equivalent to the German strong article.

(23) Akan (Arkoh & Matthewson 2013: 2)
Amstrin nyí nyímpá áà ó-dzi-i kan tu-u kɔ-ɔ Osirán Ø dò.
Armstrong is person REL 3.SG.SBJ-eat-PST first uproot-PST go-PST moon the_weak top
‘Armstrong was the first person to fly to the moon.’


(24) Akan (Arkoh & Matthewson 2013: 2)
Mó-tɔ-à ékütü. Ékütü nó yè dew papa.
1.SG.SBJ-buy-PST orange orange the_strong be nice good
‘I bought an orange. The orange was really tasty.’


Similarly to Akan, numeral classifier languages like Thai have also been shown
to employ bare nominals to express weak definites, as in (25), whereas strong
definite expressions are encoded by demonstratives or overt pronouns, as in (26)
(Jenks 2015).


(25) Thai (Jenks 2015: 106)
duaŋ-can (#duaŋ nan) sawaaŋ maak
moon CLF that bright very
‘The moon is very bright.’


(26) Thai (Jenks 2015: 112)
Previous discourse: ‘Yesterday I met a student...’
(nákrian) khon nán / (kháw) chalàat máak
student CLF that / 3P clever very
‘That student / (s)he was clever.’


All in all, empirical evidence from these languages offers a new perspective on
definiteness, showing that it is a two-fold phenomenon. Both uniqueness and
familiarity are necessary to capture the different uses of definite descriptions.
These findings make the hybrid approach the most accurate of the existing
approaches so far. This approach will also be supported by the Lithuanian data
presented in the subsequent section.


4 Strong vs. weak distinction in Lithuanian 


In this section, I discuss the occurrence of Lithuanian nominals with long and
short adjectives in familiar and unique definite environments, and in bridging
contexts, based on the examples from Schwarz (2009). I demonstrate that the
nominals with two distinct adjective forms correspond to the two distinct
definite uses, namely familiar uses and unique uses. The long adjective with the
definite morpheme -ji(s) is analogous to the German strong article and is
licensed by familiarity; recall our original example (2), repeated here in (27).
The short form adjective, in addition to its indefinite use, is compatible with
uniqueness, as in (3), repeated in (28). From now on, the short form will be
glossed as weak and the long form as strong. For the reader's convenience, I
provide glosses only for expressions under discussion. To draw clear parallels
between nominals with long and short adjectives and the strong and weak articles,
the Lithuanian data will be compared with German.


(27) Marija pristatė mane savo pusbroliui iš Vilniaus.
Marija introduced me self cousin from Vilnius
Gražus-is / #gražus pusbrolis galantiškai nusilenkė ir
beautiful-DEF_strong / beautiful_weak cousin gallantly bowed and
pabučiavo man į ranką.
kissed me to hand
‘Marija introduced me to her cousin from Vilnius. The beautiful cousin
gallantly bowed and kissed my hand.’


(28) Praėjus dviem savaitėm po rinkimų, prezidentas turi teisę atleisti
passed two weeks after elections president has right fire
naują / #naują-jį ministrą pirmininką tik išskirtiniais atvejais.
new_weak / new-DEF_strong minister prime only exceptional cases
‘Two weeks after the election, the president has a right to fire the new
prime minister only in exceptional cases.’


This study gives additional insights into the debate on how definiteness should
be characterized, and also broadens the typological landscape of how languages
express the two definites. The exploration of nominal expressions accompanied by
adjectives shows that Lithuanian typologically belongs to the group of languages
like Akan (cf. 23-24) or Thai (cf. 25-26), since it uses a bare form, the short
adjective, in situations with a unique referent, and it has one marked form,
namely the long adjective, that is equivalent to the strong article in German.
At the same time, the Lithuanian manifestation of definiteness through the
adjectival system resembles Icelandic, which also exhibits the strong vs. weak
distinction whenever evaluative adjectives intervene between D/n categories
(cf. 21-22).

Before I proceed to the discussion of definites, a couple of general remarks
regarding definiteness in Lithuanian should be kept in mind. As has been
illustrated by Gillon & Armoskaite (2015), a number of factors can affect the
definiteness of a nominal, e.g. word order or aspect. The basic word order in
Lithuanian is SVO. The syntactic position that has been reported to be mostly
neutral with respect to definiteness is the initial subject position. Even though
the definite interpretation is slightly preferred for the initial subject, both
definite and indefinite readings are available depending on the context (29).


(29) Žmog-us atvyk-o.
human-NOM.M.SG arrive-PST.3
‘The/a man arrived.’ (Gillon & Armoskaite 2015: 74)


The interpretation of the object in SVO sentences depends on aspect. The
imperfective aspect, which is unmarked, permits both definite and indefinite
readings of the object depending on the context (30a). In contrast, the
perfective aspect, which is realized with a prefix on the verb, requires the
object to be definite (30b).


(30) a. Jon-as valg-ė obuol-į.
Jonas-NOM.M.SG eat-PST.3 apple-ACC.M.SG
‘Jonas ate the/an apple.’ (Gillon & Armoskaite 2015: 75)
b. Jon-as su-valg-ė obuol-į.
Jonas-NOM.M.SG PRF-eat-PST.3 apple-ACC.M.SG
‘Jonas ate up the/#an apple.’ (Gillon & Armoskaite 2015: 76)


In order to ensure that the (in)definiteness of the nominal expressions being
tested depends purely on the context and is not influenced by the aforementioned
factors, the examples are set up in such a way that the target nominal expression
appears in the initial subject position. The cases where the tested nominals
appear in the object position will include the imperfective aspect, which does
not reinforce the definite reading. Lastly, recall from §2 that nominals with
long adjectives can have either definite or kind-level interpretations (11b),
repeated here with the original glosses in (31). The nominals in our examples
will include evaluative adjectives like strange or classifying adjectives such as
young, which lack a kind-level interpretation and provide a good testing ground
for the (in)definite interpretation of nominals.

(31) balt-as-is lok-ys
white-NOM.M.SG-DEF bear-NOM.M.SG
(i) ‘the white bear’ ✓ definite reading
(ii) ‘the polar bear’ ✓ kind reading


Having said that, I now review the basic descriptive facts that have been asso- 
ciated with short and long forms in the literature. 


4.1 Definite vs. indefinite noun phrases with adjectives 


In this sub-section, I show that nominals with short form adjectives can have an
indefinite reading whereas those with long form adjectives cannot. The Lithuanian
Grammar (Ambrazas et al. 1997) defines the short form adjective as
indefinite/unmarked and the long form adjective with the definite suffix as
definite/marked. Indeed, nominals accompanied by short adjectives can be used to
introduce a new discourse referent, a typical function of indefinites, as in
(32). The nominal with the short form of strange is used here to introduce
discourse-new information, i.e. a stranger that my friend has never heard about.
Nominals with long adjectives, in contrast, are infelicitous in this context.


(32) Context: I am telling Mary for the first time about my evening at the bar
where I have met a stranger that I have never seen before.
Vakar bare sutikau keistą / #keistą-jį vaikiną.
yesterday bar met strange_weak / strange-DEF_strong guy
‘Yesterday, at the bar, I met a strange guy.’


The long form is acceptable in cases that include a prior mention of a linguistic
antecedent (33). This suggests that nominals with long adjectives enforce an
anaphoric interpretation, which is a common feature of definite expressions.


(33) Context: I have heard about a strange guy from Mary. Finally, yesterday I
was able to meet that guy and now I am telling this story to Mary.
Vakar bare sutikau keistą-jį vaikiną.
yesterday bar met strange-DEF_strong guy
‘Yesterday, at the bar, I met the strange guy.’


Another environment showing the same pattern is existential sentences with a
post-verbal subject. The subject in this construction can only be indefinite
(Gillon & Armoskaite 2015). While nominals with short adjectives are possible in
this environment, nominals with long adjectives are not (34). This pattern is
further evidence that short adjectives can behave like indefinites, in contrast
to long adjectives, which lack this function.


(34) Context: I have heard a rustling sound in the bushes, I went closer and...
Ten buvo graži / #gražio-ji katė.
there was beautiful_weak / beautiful-DEF_strong cat
‘There was a beautiful cat.’


Taking these facts into account, at first blush there seems to be a sharp
contrast between nominals with short and long form adjectives in terms of their
(in)definite use. Nominals with short form adjectives occur in indefinite
environments. In contrast, a long adjective in a nominal expression is
incompatible with an indefinite context and is instead licensed by linguistic
antecedents, exhibiting the behavior of strong, familiarity definites, to which I
now turn.


4.2 Familiarity 


Familiarity definites are referential expressions licensed by an anaphoric link to 
a preceding expression. In German, as has already been discussed, the strong 
article, the non-contracted form, is used in such cases (17), repeated here in (35). 


(35) German (Schwarz 2009: 30)
In der New Yorker Bibliothek gibt es ein Buch über Topinambur.
in the New York library exists EXPL a book about topinambur
Neulich war ich dort und habe #im / in dem Buch nach
recently was I there and have in-the_weak / in the_strong book for
einer Antwort auf die Frage gesucht, ob man Topinambur grillen
an answer to the question searched whether one topinambur grill
kann.
can
‘In the New York public library, there is a book about topinambur.
Recently, I was there and looked in the book for an answer to the
question of whether one can grill topinambur.’


For anaphoric reference, Lithuanian employs a nominal with a long form adjective.
The first sentence in both (36) and (37) introduces a new individual, which is
expressed by a bare nominal. In the subsequent sentence, that individual is
mentioned for the second time, this time accompanied by an adjective. Only the
long form adjective is possible in these situations; the short form adjective is
infelicitous. The use of the long adjective in these examples is parallel to the
use of the strong article in German in the anaphoric context in (35).


(36) Neįtikėtina, vakar meno galerijoje vaizdo kameros užfiksavo katiną.
incredible yesterday art gallery screen cameras captured cat
Keistas-is / #keistas katinas nepabūgo žmonių ir
strange-DEF_strong / strange_weak cat not-scared people and
vaikščiojo po parodą it tikras meno žinovas.
walked through exhibition as real art connoisseur
‘Incredible, yesterday in the art gallery, cameras captured a cat. The
strange cat was not afraid of people and walked through the exhibition
as a true art connoisseur.’


(37) Marija pristatė mane savo pusbroliui iš Vilniaus.
Marija introduced me self cousin from Vilnius
Gražus-is / #gražus pusbrolis galantiškai nusilenkė ir
beautiful-DEF_strong / beautiful_weak cousin gallantly bowed and
pabučiavo man į ranką.
kissed me to hand
‘Marija introduced me to her cousin from Vilnius. The beautiful cousin
gallantly bowed and kissed my hand.’


Nevertheless, not all cases are that transparent. Examples like (38) present a
situation where the linguistic antecedent and its anaphoric expression are
identical. The newly introduced antecedent in the first sentence in (38) takes
the short form adjective, which, as discussed above, can function as an
indefinite. The anaphoric expression in the following sentence can appear in the
long form, as expected given that the long form encodes anaphoricity. However,
the short form is not completely ruled out here either. While 18 out of 27
speakers selected the long form, the remaining 9 speakers allowed the short form
as well. It can be hypothesized that the short form is available in this
situation because it is used as a unique definite, assuming that there is a
unique famous writer that the speaker is referring to. I will come back to this
type of use of short adjectives in §4.3.

(38) Jonas pas save vakarienės pakvietė žymų rašytoją ir seną
Jonas to his dinner invited famous_weak writer and old_weak
politiką. Žymus-is / žymus rašytojas maloniai priėmė
politician famous-DEF_strong / famous_weak writer pleasantly accepted
Jono kvietimą.
Jonas invitation
‘Jonas has invited a famous writer and an old politician for dinner. The
famous writer pleasantly accepted Jonas’ invitation.’


Anaphoric expressions can be more general than their antecedents. The more 
general anaphoric definite in German is expressed by the strong article (39) and 
the weak article definite is prohibited. The same behavior is observed in situa- 
tions where the anaphoric phrase is an epithet as in (40). 


(39) German (Schwarz 2009: 31)
Maria hat einen Ornithologen ins Seminar eingeladen. Ich halte von
Maria has an ornithologist to-the seminar invited I hold of
dem / #vom Mann nicht sehr viel.
the_strong / of-the_weak man not very much
‘Maria has invited an ornithologist to the seminar. I don't think very
highly of the man.’


(40) German (Schwarz 2009: 31)
Hans hat schon wieder angerufen. Ich will von dem / #vom
Hans has already again called I want of the_strong / of-the_weak
Idioten nichts mehr hören.
idiot not hear
‘Hans has called again. I don't want to hear anything anymore from that
idiot.’


Similarly, long adjectives can appear with anaphoric nominals that do not
completely match their antecedents. For example, the individual introduced by the
proper name Darius is referred to in the second mention as ‘clingy guy’, with the
adjective in the long form rather than the short form, as illustrated in (41).
The long form is also preferred over the short one with anaphoric epithets (42).

(41) Darius man šiandiena skambino net dešimt kartų. Įkyrus-is /
Darius me today called even ten times clingy-DEF_strong /
#įkyrus vaikinas visiškai pamišo.
clingy_weak guy totally went.mad
‘Darius called me today at least ten times. The clingy guy went totally
mad.’


(42) Darius, būdamas vos penkerių metų, laimėjo matematikos olimpiadą.
Darius being only five years won math olympiad
Jaunas-is / #jaunas genijus labai didžiuojasi savo pasiekimais.
young-DEF_strong / young_weak genius very proud self achievements
‘Being only five years old, Darius won the math olympiad. The
young genius is very proud of his achievements.’


Lastly, the strong vs. weak distinction can be observed in covarying uses, where
the value of the quantifier determines the value of the definite. German
covarying anaphoric uses are incompatible with the weak article and select the
strong article instead (43).


(43) German (Schwarz 2009: 33)
Jedes Mal, wenn ein Ornithologe im Seminar einen Vortrag hält,
every time when an ornithologist in-the seminar a lecture holds
wollen die Studenten von dem Mann wissen ob Vogelgesang
want the students of the_strong man know whether bird.singing
grammatischen Regeln folgt.
grammatical rules follows
‘Every time an ornithologist gives a lecture in the seminar, the students
want to know from the man whether bird songs follow grammatical
rules.’


Again, the long form adjective seems to be equivalent to the German strong
article and surfaces in covarying uses as part of the anaphoric expression (44).⁸
In addition, the nominal with the short form is felicitous for 12 speakers out of
27. Indeed, this context suffices to identify a unique famous artist. The
speakers selecting the short form might be accessing this reading, given that the
short form, as will be demonstrated below, is compatible with uniqueness.


⁸ This example is modeled on Ingason's (2016: 134) example from Icelandic.

(44) Kiekvieną kartą kai kino žvaigždė aplanko mokyklą, studentai
every time when movie star visits school students
visuomet klausia žymio-jo / žymaus artisto ar aktoriai
always ask famous-DEF_strong / famous_weak artist whether actors
gerai uždirba.
well earn
‘Every time a movie star visits the school, students always ask the
famous artist if actors earn well.’


To summarize, I have examined the behavior of nominals with short and long
adjectives in anaphoric environments that include identical and non-identical
linguistic antecedents, more general anaphoric phrases, and anaphoric expressions
in covarying uses. It has been demonstrated that Lithuanian, similarly to German,
has one form that functions like a familiar definite, namely the long form
adjective with the definite suffix -ji(s). Nominals with short form adjectives
lack anaphoric properties. However, they do arise in contexts where the referent
can count as unique.


4.3 Uniqueness 


The fact that nominals with short adjectives can be indefinite, as illustrated in 
§4.1, is only one part of the story. Gillon & Armoskaite (2015) point out that, de- 
pending on the context, the short form adjectives can also have a definite reading. 
I now investigate this possibility by showing that nominal expressions with short 
forms can occur in situations that are licensed by uniqueness. 


4.3.1 Larger situation environments 


Larger situation environments (Hawkins 1978) license weak definites and permit 
only weak articles in German as illustrated in (45). 


(45) German (Schwarz 2009: 31)
Der Empfang wurde vom / #von dem Bürgermeister eröffnet.
the reception was by-the_weak / by the_strong mayor opened
‘The reception was opened by the mayor.’


Interestingly, both types of adjectives are available in Lithuanian, but they are
associated with different readings. The nominal with a short form stands for a
unique individual licensed by general world knowledge, as exemplified in (46).
Example (46) states a general rule: according to the law, the president can fire
whoever occupies the role of the new prime minister.


(46) Praėjus dviem savaitėm po rinkimų, prezidentas turi teisę atleisti
passed two weeks after elections president has right fire
naują / #naują-jį ministrą pirmininką tik išskirtiniais atvejais.
new_weak / new-DEF_strong minister prime only exceptional cases
‘Two weeks after the election, the president has a right to fire the new
prime minister only in exceptional cases.’


In contrast, the long form denotes context-specific unique individuals. For
example, once the election has taken place, everyone knows who the new president
is. Thus, there is a specific unique individual, and the long form is used to
encode such a reading, as in (47).


(47) Po rinkimų naujas-is / #naujas prezidentas paskambino
after elections new-DEF_strong / new_weak president called
miestelio merui.
city mayor
‘After the election, the new president called the city mayor.’


Note that it is not uncommon for different types of uniqueness contexts to be
encoded by different forms. For instance, Thai makes a distinction between unique
individuals that are supported by world knowledge and those that are not
(Jenks 2015). Generally, Thai provinces elect one Senator and two Members of
Parliament. In (48), the bare noun phrase, generally used for weak definites,
denotes a unique senator, and this referent is licensed by world knowledge. To
encode a reading that distinguishes a unique individual from another individual,
the demonstrative, typically used for anaphoric reference, is used (49).


(48) Thai (Jenks 2015: 107)
sɔɔ-wɔɔ chiaŋ-may (#khon nan) groot mâak
senator Chiang.Mai CLF that angry very
‘The/#that Senator from Chiang Mai is very angry.’


(49) Thai (Jenks 2015: 107)
sɔɔ-sɔɔ chiaŋ-may #(khon nan) groot maak
M.P. Chiang.Mai CLF that angry very
‘#The/that M.P. from Chiang Mai is very angry.’

Additionally, unique definite nominals can also be based on social or cultural
knowledge (Hawkins 1978). Again, both forms are possible in Lithuanian, yielding
different interpretations. Lithuanian comparative adjectives occur with the
suffix -esn-, which is equivalent to the English -er in cases like smarter. Both
short and long adjectives can have a comparative form. The short form with the
comparative suffix, as in (50), refers to a generic set of children that is
unique. Nevertheless, in contrastive sentences that include a specific unique set
of children, both forms are available (51).


(50) Mokslo komitetas norėtų, kad mokyklą pradėtų lankyti
education committee would.want that school begin attend-INF
jaun-esn-i / #jaun-esn-ie-ji vaikai.
young-COMP-NOM.M.PL_weak / young-COMP-NOM.M.PL-DEF_strong children
‘The education committee wants the younger children to start attending
the school.’ (adapted from the Internet)


(51) Jaun-esn-ie-ji / jaun-esn-i vaikai
young-COMP-NOM.M.PL-DEF_strong / young-COMP-NOM.M.PL_weak children
žaidė smėlio dėžėje, o vyr-esn-ie-ji /
played sand box while old-COMP-NOM.M.PL-DEF_strong /
vyr-esn-i vaikai laipiojo medžiais.
old-COMP-NOM.M.PL_weak children climbed trees
‘The younger children were playing in the sand box, while the older
children were climbing the trees.’


4.3.2 Bridging context 


I establish a further distinction between nominals with short and long adjectives
by exploring bridging contexts (Clark 1975). There are two types of bridging
contexts: part-whole and product-producer. The former licenses the unique
definite article, whereas the latter is associated with the familiar definite.
This contrast is reflected in German: the weak article is permitted in the
part-whole context (52) and the strong article is realized in the
product-producer environment (53).


(52) German (Schwarz 2009: 52)
Der Kühlschrank war so groß, dass der Kürbis problemlos im /
the fridge was so big that the pumpkin problem in-the_weak /
#in dem Gemüsefach untergebracht werden konnte.
in the_strong crisper stowed be could
‘The fridge was so big that the pumpkin could easily be stowed in the
crisper.’


(53) German (Schwarz 2009: 53)
Das Theaterstück missfiel dem Kritiker so sehr, dass er in seiner
the play displeased the critic so much that he in his
Besprechung kein gutes Haar #am / an dem Autor ließ.
review no good hair on-the_weak / on the_strong author left
‘The play displeased the critic so much that he tore the author to pieces
in his review.’


Placing the short form adjective in the part-whole environment results in
felicity. In a situation where I am telling my friend for the first time about my
car breaking down, the short form is used to refer to the old engine, which is
part of my car (54). This provides additional evidence that the short form is
compatible with situations governed by uniqueness. In contrast, the long form
becomes acceptable in bridging contexts if the listener has prior knowledge about
the old engine (55).


(54) Context: I am telling my friend for the first time about what happened to
my car yesterday. My friend has no prior knowledge about the car.
Vakar sugedo mano automobilis, kurį vairavau ištisus
yesterday broke.down my car that drove whole
dešimtmečius! Autoserviso darbuotojai dabar taiso seną / #seną-jį
decades repair.shop employees now fix old_weak / old-DEF_strong
variklį. Tikiuosi automobilis ir vėl važiuos puikiai.
engine hope car and again will.drive well
‘Yesterday, my car, which I have been driving for entire decades, broke
down. The mechanics are now fixing the old engine. I hope that the
car will work great again!’


(55) Context: I have told my friend before that my car kept on breaking down
because the old engine was not working properly. Today, I met my friend
and told him again about my problems with the old engine.
Vakar sugedo mano automobilis. Autoserviso darbuotojai dabar
yesterday broke.down my car repair.shop employees now
taiso #seną / seną-jį variklį. Tikiuosi automobilis ir vėl
fix old_weak / old-DEF_strong engine hope car and again
važiuos puikiai.
will.drive well
‘Yesterday, my car broke down. The mechanics are now fixing the old
engine. I hope that the car will work great again!’


If the long form indeed functions like a strong article, it should appear in
product-producer bridging. This prediction is borne out. Modifying the author of
the book with a long form yields felicity, as in (56). Of the 27 speakers, 20
preferred the long form; their judgment is shown in the example. The remaining 7
speakers selected the short form. While it is unclear why some speakers use the
short form in this context, the contrast is robust for the 20 speakers who
preferred the long form.


(56) Knyga “Lietus” sulaukė neįtikėtino populiarumo, nepaisant to, kad
book ‘Rain’ received incredible popularity despite that
talentingas-is / #talentingas rašytojas nusprendė likti anonimas.
talented-DEF_strong / talented_weak writer decided remain anonymous
‘The book ‘Rain’ became incredibly popular despite the fact that the
talented writer decided to remain anonymous.’


All in all, the examination of larger situations and bridging contexts provides
evidence that nominals with short form adjectives can have a definite reading.
Short adjectives resemble weak definites given their acceptability in part-whole
bridging contexts and in larger situations based on general world knowledge. The
fact that nominals with long adjectives are allowed in larger situations but do
not emerge in part-whole bridging contexts tells us that this form lacks the
properties of a true weak article definite. While a precise characterization of
the conditions that govern the use of long forms in larger situations requires
further research, it is intriguing that a similar split within this environment
also exists in numeral classifier languages like Thai.


4.4 Section summary and implications 


To summarize this section, I have provided additional arguments that nominals
with long form adjectives lack indefinite uses and indeed function like definites,
as has been suggested by Gillon & Armoskaite (2015). Specifically, using
different familiarity environments and product-producer bridging contexts, I
demonstrated that nominals with long form adjectives resemble German nominals
with the strong article licensed by familiarity. Furthermore, while nominals with
short adjectives seem to be unmarked for definiteness, as noted by Ambrazas et al.
(1997), definite contexts were presented that trigger the occurrence of the short
form. Nominals with short form adjectives surface in part-whole bridging contexts
and in larger situations based on general world knowledge, and thereby function
like weak definites.

Given that I have argued that Lithuanian has two adjective forms that occur in
definite environments, an anonymous reviewer asks what the basic structure of a
Lithuanian noun phrase would be. Indeed, these findings have important
implications for what the structure of a noun phrase could look like. Following
Gillon & Armoskaite (2015), I assume that definite phrases in Lithuanian involve
a D layer. The long form, which is the short form plus the definite suffix
-ji(s), expresses anaphoricity. I take the D head to be -ji(s).⁹ Recall that the
short form is compatible with uniqueness, which suggests that in those cases
there should also be a D head, but one that is not overtly expressed. Therefore,
the D head can be realized either as the suffix -ji(s) or as null, as illustrated
in (57).


(57) The basic structure of Lithuanian definite nominals 


              DP
             /  \
            D    AP
            |   /  \
     -ji(s)/∅  A    NP
               |     |
            gražus  lokys
       ‘beautiful’  ‘bear’


?Note that the suffixation of the definite morpheme is subject to local adjacency. The suffix 
cannot be realized on the adjective if there is an adverb intervening between the D head and 
the noun as shown in (i). 


(i) a. gražus-is lokys 
beautiful-DEF bear 
‘the beautiful bear’ 
b. * labai gražus-is lokys 
very beautiful-DEF bear 
‘the very beautiful bear’ 
5 Conclusion 


This paper has aimed to show that the distribution of short and long form 
adjectives in Lithuanian supports Schwarz's (2009; 2013) claim that there exist two 
types of definites: familiar definites and unique definites. The detailed analysis of 
nominals with two kinds of adjectives has revealed interesting parallels between 
two distinct languages, Lithuanian and German. Lithuanian, similarly to German, 
can use two forms to encode definiteness: long form adjectives are compatible 
with familiarity and short form adjectives are compatible with uniqueness. 
This distinction emphasizes the need to adopt the hybrid approach that includes 
both familiarity and uniqueness for the analysis of definite uses. The reality of 
the strong vs. weak distinction is further supported by genetically unrelated 
languages that use similar means to encode this distinction. Lithuanian 
patterns with languages like Akan and Thai since it uses a bare form, the short 
adjective, for uniqueness and it has one marked form, namely the long adjective, 
that is equivalent to the strong article in German. 

Long and short form demonstratives are also distinguished in Lithuanian. 
Further research should examine the nature of the definite interpretation of 
these forms and how it relates to short vs. long adjective variation 
in Slavic. 
Chapter 4 


On (in)definite expressions in American 
Sign Language 


Ava Irani 


University of Pennsylvania 


This paper provides an analysis of the properties and distribution of the pointing 
sign IX and bare NPs in American Sign Language. I argue that IX followed by an NP, 
when referring to a previously established locus, is a strong definite article along the 
lines of Schwarz (2009; 2013). This claim runs contrary to previous analyses that draw 
parallels between IX and demonstratives (Koulidobrova & Lillo-Martin 2016). The 
data presented here also show that both bare NPs and IX+NPs double as definites 
and indefinites, which suggests that definiteness is not semantically encoded in the 
language. I further illustrate that the interaction of the use of bare NPs and IX+NPs 
indicates that the specification of a locus has an impact on the interpretation of an 
expression as being definite or indefinite. An IX+NP cannot refer back to a bare NP 
in the discourse due to the underspecification of a locus feature that characterizes 
bare NPs. These findings allow me to reanalyze the properties of the two kinds of 
nominals in the language. 


1 Introduction 


Definite and indefinite expressions in natural language are two widespread 
components of communication. Despite their ubiquitous presence, the way in which 
each language conveys these expressions can vary. For instance, English 
indefinites are typically viewed as being introduced by the article a, while the precedes 
definite NPs. The distinction does not stop there. Schwarz (2009) observes that 
languages can further divide categories of definite expressions into those that 
encode uniqueness and those that are anaphoric and familiar. There are also 
languages like Hindi, which lack overt determiners altogether. Languages of this 
type have prompted the claim that their bare nominal expressions lack a DP layer, 
as they do not encode pure indefinite readings (Dayal 2004). And finally, there 
has been a plethora of research at least since the late 1800s on the properties of 
definite and indefinite expressions in discourse (Frege 1892; Russell 1905; Kamp 
1981; Heim 1982, i.a.). In this paper, I investigate a language that contributes to the 
discussion on definiteness in varying respects, while simultaneously allowing us 
to examine natural language expressed via a different modality. 

American Sign Language (ASL) is generally claimed to be a language without 
overt determiners, but it signifies the relationship between nominal expressions 
in more than one manner. Nominal phrases can be expressed as bare NPs, or they 
can also be set up at locations in signing space through the use of loci. A language 
with more than one way of conveying nominals adds another dimension 
to the goal of understanding how definite and indefinite reference is realized in 
language. 

Sign languages have been of interest in examining various linguistic phenom- 
ena due to their use of a different medium of communication and the visibility 
that signs provide to language through the use of this modality. Despite sign 
language research gaining momentum since Stokoe’s initial work in the 1960s, 
much work is left to be done in terms of thoroughly describing fundamental as- 
pects of these languages. This paper aims to deepen our knowledge of the array 
of possible alternatives through which definite and indefinite referents can be 
expressed. 

Although recent work has shown interest in definite NPs in ASL, there has 
been some disagreement in the literature in determining their status (Bahan et 
al. 1995; Koulidobrova & Lillo-Martin 2016). Definiteness in ASL has been said 
to be expressed via the index marker, glossed as IX¹ (Bahan et al. 1995), despite 
indexing and IX having been described as performing multiple functions (e.g. 
Lillo-Martin & Klima 1990). In the sections to follow, I discuss the nature of 
definiteness, and explicate the behavior of IX in definite environments. My proposal 
is compatible with the analysis of loci as being composed of morpho-syntactic 
features. Previous work has focused on loci as overt manifestations of indices 
(Lillo-Martin & Klima 1990; Schlenker 2010). The analysis argued for here 
follows that line of work, while also focusing on bare NPs introducing indices. I 
show that ASL has two types of indices: one type that is introduced by NPs 
specified for a locus, and another that is introduced by bare NPs, which are 
underspecified for loci. The interaction of these systems has consequences for 


¹Throughout this paper, I refer to the pointing sign, i.e. the index marker, in ASL as IX. When 
referencing indices or an index, I am referring to the formal semantic indices introduced by 
NPs in the discourse. 
the definite or indefinite interpretation of expressions. My proposal that loci are 
composed of features is motivated by previous work on locus re-use (Kuhn 2015), 
but follows Schlenker (2016) in adopting the featural variable view of loci, which 
ties in with my claims about definiteness in the language. 

The ASL judgments provided in this paper are from three native signers who 
have been exposed to the language from birth. The consultants were presented 
with the target ASL sentences in the target language, and asked for grammatical- 
ity judgments and whether or not any particular construction was felicitous in 
ASL. They were also asked to provide the possible interpretations of each data 
point. Judgment reports of the data were preferred over examining data from 
more naturalistic sources such as corpora for two reasons: i) the circumstances 
in which the particular kinds of examples investigated in this paper would be 
found naturally occur infrequently, and ii) corpora do not allow for a study of 
infelicitous linguistic environments, which are crucial to the central idea of the 
proposal. One cannot be certain whether a construction that occurs with low 
frequency in a corpus is impossible in a given language or whether the opportunity 
to use it was simply not present. 

This paper is structured as follows: first, I present an overview of previous 
work on definiteness in ASL, which focuses on the use of the index marker IX. 
Next, I take what has been previously discussed on IX and reanalyze it to draw 
parallels between IX and the two types of definite articles noted for numeral 
classifier languages (Jenks 2015). Even though IX can be seen as a strong definite 
article in the sense of Schwarz (2009), I will argue that ASL does not canonically 
encode definiteness lexically. Instead, there appears to be a more pragmatic force 
involved. IX+NP can be a definite or indefinite expression depending on whether 
it refers to another already introduced IX+NP at the same locus. 


2 Background 


The subsections below set the stage for developing an analysis of IX. I first 
provide a description of some commonly known uses of loci when they are 
introduced by IX; then, I present previous work on IX that argues for it as a 
demonstrative, an analysis that I go on to reject. Finally, I show that IX behaves 
differently when it is referring to a previously established locus, as opposed to 
when it is not. This background will be beneficial in discussing the behavior of 
IX and indefinites in the language. 
2.1 Loci 


Before diving into the details of previous analyses of IX, one must first 
understand its typical uses. A common use of the index marker is to make reference 
to entities. When an entity is first introduced in the discourse, the index (IX) can 
be used to establish a locus for the entity, which can later be referred to in the 
discourse (Klima & Bellugi 1979; Lillo-Martin & Klima 1990). By establishing a 
locus as the point of reference, the signer can simply point back with IX to the 
locus to refer back to the entity that was previously introduced. (1) is an example 
of such a use of IX.² 


(1) IX_a SARA_a, IX_b STACY_b a-BOTH-b FRIENDS. IX_a LIKES IX_b.³ 
‘Sara_i and Stacy_j are friends. She_i likes her_j.’ 


The sentence above illustrates how each locus is associated with an entity. In 
(1), locus a is associated with SARA while locus b is associated with STACY. (2) 
fleshes out the paradigm of loci uses. The examples also show that loci typically 
refer to the entities set up at that location. 


(2) a. WHEN IX_a SOMEONE_a LIVE WITH IX_b SOMEONE_b, 
‘When someone lives with someone,’ 


b. IX_a LOVE IX_b. 
‘the former loves the latter.’⁴ (adapted from Schlenker 2010: 13) 


As seen in (2), the loci retain their referents, giving a meaning that can be 
translated as ‘the former’ and ‘the latter’ in English. Moreover, in addition to 
entities, IX can also be used to refer to VPs. 


²Any examples without citation are elicited from my own fieldwork with native ASL signers. 

³Signs are glossed in small capital letters as is standard in the literature. Loci are uniformly 
indicated with IX and a subscript both on IX itself and the nominal that follows. All cited 
examples have been adapted to fit this format. 

⁴When both instances of IX point to the same locus in signing space, as below, the 
sentences are infelicitous: 


(i) a. #IX_a LOVE IX_a 
‘the former loves the former.’ 


b. #IX_b LOVE IX_b 
‘the latter loves the latter.’ 


The unacceptability of these sentences follows from standard assumptions 
about binding theory (Reinhart & Reuland 1993) and from the special reflexive morphology 
that is required for ASL in these cases (Meir 1998). 
(3) IX_a GET_a JOB_a DISJ/shift IX_b GO_b GRADUATE-SCHOOL_b. IX_a I CAN IX_b 
IMPOSSIBLE. 
‘Get a job or go to graduate school? The former I can do, but the latter is 
impossible.’ (Koulidobrova & Lillo-Martin 2016: 226) 


The example in (3) shows that the use of IX is not restricted to entities. Once 
loci are established, one can use IX as many times as necessary in the discourse 
to refer back to the entity or proposition assigned at the locus. 


2.2 Previous work 


The most recent work on IX has argued for it to be a demonstrative (Koulidobrova 
& Lillo-Martin 2016), as opposed to a definite article (Bahan et al. 1995). Although 
in this paper I show evidence in favor of IX as a definite article, I first present 
parts of Koulidobrova & Lillo-Martin's analysis in order to discuss patterns in 
the language that my analysis aims to capture. 

Koulidobrova & Lillo-Martin (2016) base their argument on the assumption 
that definite articles are licensed by uniqueness; the use of IX, however, appears 
to be infelicitous in uniqueness contexts. 


(4) FRANCE (#IX_a) CAPITAL_a WHAT. 
‘What is the capital of France?’ (Koulidobrova & Lillo-Martin 2016: 234) 


(5) TODAY SUNDAY. DO-DO? GO CHURCH, SEE (#IX_a) PRIEST_a. 
‘Today is Sunday. What to do? I’ll go to church, see the priest.’ 
(Koulidobrova & Lillo-Martin 2016: 234) 


The above examples show that IX is not licensed by uniqueness. Although there 
is only one capital of France, IX in (4) is ungrammatical. Similarly, (5) disallows 
IX with PRIEST even when referring to a single priest in a church. This point will 
become relevant in the following sections when I propose my analysis. For now, 
I simply note that bare NPs are required in these uniqueness situations. 

Another common use of definite articles in many languages is an anaphoric 
one. When IX+NP is not referring to a locus that has been previously established 
in signing space, it is unacceptable in anaphoric environments. 


(6) TODAY SUNDAY. DO-DO. GO CHURCH, SEE PRIEST. (#IX_a) PRIEST_a NICE. 
‘Today is Sunday. What to do? I’ll go to church, see the priest. The priest 
is nice.’ (Koulidobrova & Lillo-Martin 2016: 234) 
In (6), IX is infelicitous with the second instance of PRIEST even when its first 
mention is present in the discourse. The inability of IX to appear in these cases 
can be explained under their account of IX being a demonstrative, since 
demonstratives are not licensed without a contrastive reading or a kind of 
demonstration. Based on the above examples with uniqueness and anaphoricity, it might 
be tempting to label the index marker as a demonstrative; however, in further 
sections, I show that although there are some similarities between IX and 
demonstratives, there are also differences between them. Foreshadowing the analysis 
developed in this paper, I note that IX in (6) attempts to make reference to a 
referent introduced by a bare NP, and not a referent that was previously established at 
a locus. I show in the following sections that the anaphoric cases of IX are indeed 
felicitous when referring to a previously mentioned NP with an associated locus. 
Moreover, I argue that IX, when referring to previously used loci, is best analyzed 
as a strong article definite along the lines of Schwarz (2009; 2013). 


3 Two types of definites in ASL 


This section presents the two types of definite articles described by Schwarz, the 
strong definite article and the weak definite article, which occur 
cross-linguistically. I argue here that the ASL index preceding an NP, when referring to 
previously introduced loci, patterns with the strong definite article. IX is also shown 
to behave unlike other demonstratives in the language, which is additional 
evidence for the strong article definite analysis. Weak article definites are argued 
to be expressed by bare NPs, similar to the kind noted for numeral classifier 
languages (e.g. Jenks 2015). 


3.1 Two types of definites cross-linguistically 


Schwarz (2009; 2013) has observed two types of definite articles that are found in 
a host of unrelated languages: strong definite articles, which encode familiarity 
and anaphoricity, and weak definite articles, which encode uniqueness. Before 
diving into the properties of these two kinds of definite articles, let me first con- 
sider some typical uses of definiteness in natural language. The following are 
some examples from Hawkins (1978) modelled after Schwarz (2009): 


(7) Anaphoric use 
John bought a book and a magazine. The book was expensive. 


(8) Immediate situation 
the table (uttered in a room with exactly one table) 
(9) Larger situation 
the president (uttered in the US) 


(10) Bridging (Clark 1975) 
a. John bought a book. The author is French. 


b. John’s hands were freezing as he was driving down the street. 
The steering wheel was bitterly cold and he had forgotten his gloves. 


The examples in (7-10) indicate the various flavors in which definites can ap- 
pear. (7) describes a use of definites that requires referring back to an already 
introduced linguistic referent in the discourse. As shown in (8) and (9), the defi- 
nite NP does not need a linguistic antecedent; it can also refer to a salient entity 
in the environment. Similarly, (10) presents examples in which the definite NP 
stands in a relation to its antecedent. (10a) illustrates a product-producer 
bridging relationship between the book and the author, while (10b) shows a part- 
whole relationship between the car described by the driving event and the steer- 
ing wheel. The different types of definiteness here are relevant for the discussions 
to follow. 

The definite expressions above appear in two forms across languages. They are 
divided along the lines of definite articles that denote familiarity or uniqueness 
(Schwarz 2009; 2013). These are termed the strong article definite and the weak 
article definite, respectively. The following is an instance of an environment in 
which a weak article definite is licensed:⁵ 


(11) Context: There is only one blackboard in the classroom and 
the professor says: 
I won't be using the blackboard today. 


The definite article the is felicitous in the example above even though a referent 
has not been previously introduced. The presence of a unique blackboard in the 
classroom is sufficient to make the use of the definite article possible. Part-whole 
bridging is another situation in which weak definite articles are licensed. 


(12) The police stopped the car because the rear-view mirror was broken. 


In the example above, the rear-view mirror is a part of the car, and hence, 
the relationship between them is said to be part-whole. These cases also encode 


⁵English lacks the strong and weak article definite distinction; I use the examples here for purely 
expository purposes. 
uniqueness, and languages that show a distinction between the two types of 
definite articles employ a weak article definite here. 

Strong definite articles, on the other hand, are based on familiarity — i.e. they 
are linked anaphorically to an antecedent. (13) illustrates definite articles in 
strong environments. 


(13) I bought a book. The book was interesting. 


The definite article in (13) is used with the second occurrence of book. This 
usage is licensed by the presence of a contextually salient linguistic referent in 
the first sentence, which, in this instance, is an indefinite expression. Languages 
with both types of articles use a distinct strong article definite in these familiarity 
cases. 

This distinction was first observed in German (Heinrichs 1954; Hartmann 1982; 
Schwarz 2009; i.a.), which employs two overt forms of a definite marker to indicate 
the two types of definiteness. 


(14) German (Schwarz 2009: 52) 
Der Kühlschrank war so groß, dass der Kürbis problemlos 
the fridge was so big that the pumpkin without-a-problem 
im / #in dem Gemüsefach untergebracht werden konnte. 
in-the_weak / in the_strong crisper stowed be could 
‘The fridge was so big that the pumpkin could easily be stowed in the 
crisper.’ 

(15) German (Schwarz 2009: 53) 
Das Theaterstück missfiel dem Kritiker so sehr, dass er in seiner 
the play displeased the critic so much that he in his 
Besprechung kein gutes Haar #am / an dem Autor ließ. 
review no good hair on-the_weak / on the_strong author left 
‘The play displeased the critic so much that he tore the author to pieces in 
his review.’ 


Although two forms of the definite marker are available, German obligatorily 
requires the contracted version in (14) and the uncontracted version in (15). These 
facts arise due to the type of bridging relations: (14) includes a part-whole rela- 
tion, a weak article definite environment, while (15) includes a product-producer 
one, a strong article definite environment. With these German facts in place, 
I will now examine how the distinction plays out in other languages. Akan, a 
Niger-Congo language, shows a strikingly similar pattern of definiteness: 
(16) Akan (Arkoh & Matthewson 2013: 39) 


Ámstrón nyí nyímpá aa ó-dzí-i kán tu-u kɔ-ɔ dsiran 
Armstrong is person REL 3SG.SBJ-eat-PST first uproot-PST go-PST moon 
dá. 
top 


‘Armstrong was the first person to fly to the moon.’ 


(17) Akan (Arkoh & Matthewson 2013: 52) 
Ámá tó-ó nsá freë nnómáhwéfó bi bá-à nkyrékyíré 
Ama throw-PST hand call-PST birds.observer REF came-PST teaching.NOM 
naasi. Mi-n-gyi pàpá nó n-dzí kitsikitsi. 
POSS.under 1SG.SUBJ-NEG-take man FAM NEG-eat small.RED 
‘Ama invited a (certain) ornithologist to the seminar. I don’t trust the man 
in the least.’ 


Exactly as was observed for the German strong article definite, the Akan 
familiarity marker nó must occur in strong article definite environments, as in (17). 
(16), in contrast, refers to a unique moon, which does not license the familiarity 
marker, and unlike in German, the weak article definite is expressed as a bare NP. 
Thai, a numeral classifier language, also does not license a definite marker in weak 
article definite cases, and a bare NP is used instead. The Thai example below 
patterns exactly like the Akan case in (16) that encodes uniqueness. 


(18) Thai (Jenks 2015: 7) 
rot khan nan thuuk tamrüat sakat phróʔ máj.dàj tit satikaa 
car CLF that ADV.PST police intercept because NEG attach sticker 
wáj thii thábian (#baj nan) 
keep at license CLF that 
‘That car was stopped by police because there was no sticker on the 
license.’ 


The part-whole relation between the sticker and the car results in a weak ar- 
ticle definite environment, where a bare NP is used. However, anaphoricity li- 
censes the obligatory presence of a classifier, which is argued to be the strong 
definite article in Thai (Jenks 2015). 


(19) Thai (Jenks 2015: 7) 


fool khít waa klon bot nan prs? maak mée-wáa khaw ca maj 
Paul thinks COMP poem CLF that melodious very although 3 IRR NEG 
cháop naktéenkloon #(khon nan) 
like poet CLF that 
‘Paul thinks that poem is beautiful, though he doesn't really like the poet.’ 


Now that I have discussed the patterns to be expected of strong and weak 
definite articles across languages, I can examine the occurrences of the ASL IX 
in exactly these circumstances. In the following section, I apply the above tests 
to IX in ASL and show that it indeed behaves like a strong definite article. 


3.2 IX as a strong definite article 


Previous work (Koulidobrova & Lillo-Martin 2016) has claimed that IX is a 
demonstrative as it apparently fails to occur felicitously in definite environments and 
displays behavior typically expected of demonstratives. In this section, I address 
the first part of the argument and show that IX is obligatorily used in strong 
definite environments when referring to loci already established in the discourse, 
thus indicating that IX can play the role of a strong definite article. 

It has been claimed that IX cannot occur in certain definite environments, like 
in (6) repeated below as (20): 


(20) TODAY SUNDAY. DO-DO. GO CHURCH, SEE PRIEST. (#IX_a) PRIEST_a NICE. 
‘Today is Sunday. What to do? I’ll go to church, see the priest. The priest 
is nice.’ (adapted from Koulidobrova & Lillo-Martin 2016: 234) 


The example above suggests that IX with an NP cannot have a bare NP as 
its antecedent, but it is not informative regarding the overall status of IX or its 
interpretation in the given utterance. As stated earlier, IX can be used to 
establish a locus for a referent in signing space. Once a locus has been introduced 
with IX, a different pattern emerges. This is illustrated in (21) below: 


(21) JOHN BUY IX_a MAGAZINE_a, IX_b BOOK_b. IX_b BOOK_b EXPENSIVE. 
‘John bought a magazine and a book. The book was expensive.’ 


The occurrence of IX in (21) would be surprising if it were a demonstrative. For 
instance, English does not permit demonstratives in these anaphoric cases. 


⁶De Sá et al. (2012) find a morphosyntactic distinction between strong and weak definites in 
Brazilian Sign Language (Libras). However, this distinction follows Carlson & Sussman's (2005) 
line of work where weak definites in instances such as John went to the store do not have a 
uniqueness requirement. I will not discuss this work any further, but the reader is referred to 
Carlson & Sussman (2005) and Carlson et al. (2006) for more detail. The relevant distinction 
in the definiteness domain here is that based on familiarity and uniqueness between what 
Schwarz (2009) calls the strong article definite and weak article definite. 
(22) John bought a book and a magazine. The/#That book was expensive.⁷ 

In addition to these examples where IX is possible in environments that only 
permit definite articles and not demonstratives, IX also occurs in instances of 
product-producer bridging. 


(23) JOHN BUY IX_a BOOK_a. #(IX_a) AUTHOR_a SELF FRENCH. 
‘John bought a book. The author is French.’⁸ 


The examples in (21) and (23) are parallel to the German, Akan, and Thai cases 
seen earlier. Anaphoricity licenses the occurrences of IX, which is exactly what 
holds for the strong definite article. Moreover, the availability of the index in 
these examples is a non-trivial problem for an IX-as-demonstrative approach. 
Although definite articles are possible in the environment in (23), demonstratives 
are not, as seen from English in (24). 


(24) John bought a book. The/#That author is French. 


This section served to illustrate three things. First, bare NPs cannot serve as 
antecedents for IX.⁹ Second, IX is possible in definite environments when referring 
back to previously established loci and patterns with the strong definite article. 
And third, IX can appear in environments where demonstratives are infelicitous. 
The following section elaborates on this last point. 


3.3 IX versus demonstratives 


I have provided evidence for IX as a strong definite article; in this section, I 
also present arguments for IX behaving distinctly from demonstratives. ASL is 
already known to have a demonstrative THAT in the language, which is signed with 


⁷This sentence becomes more acceptable if that is pronounced with some exclamation. This 
gives the utterance an emphatic meaning. On the other hand, this emotive reading is not as 
available if the predicate is relatively more mundane; for instance, John bought a magazine 
and a book. That book was red. is much worse than a definite article use even with an emphasis 
on that. 

⁸The possessive in ASL has a different form, the (flat) B handshape. The example here does not 
indicate a possessive like book's author since the index finger with the 1 handshape is used 
instead, without the NP book. 

⁹A reviewer asks whether it is too strong a claim to argue that IX+NP cannot refer back to 
bare NPs. The consultants whose judgments are reported here did not allow it. However, it is 
possible that some variation can be found in this area. For instance, Sereikaite (2019) (in this 
volume) finds variation in the product-producer bridging cases in Lithuanian. 
a Y handshape.¹⁰ Therefore, an easy test for the IX-as-demonstrative hypothesis 
is to place IX in the same environment as THAT and observe their behavior. This 
sign was not examined by Koulidobrova & Lillo-Martin (2016) in their 
investigation of IX. 

Although demonstratives and definite articles both contain presuppositions 
of familiarity and uniqueness, demonstratives carry with them an accompanying 
demonstration (Roberts 2002). It is a known property of demonstratives that they 
enforce a contrastive reading. This property renders sentences like the following 
infelicitous with that: 


(25) A car drove by. The/#That horn was honking loudly. (Wolter 2006: 70) 
(26) I met a doctor and a banker. The/#That banker was full of himself. 


The sentences above are infelicitous with the demonstrative due to the lack of 
a contrastive reading. On the other hand, I have already shown that a sentence 
like (26) in ASL permits IX, which would be surprising if IX were a demonstrative 
that requires a contrastive interpretation. The example in (21) is repeated below 
in (27). 


(27) JOHN BUY IX_a MAGAZINE_a, IX_b BOOK_b. IX_b BOOK_b EXPENSIVE. 
‘John bought a magazine and a book. The book was expensive.’ 


The counterpart of the sentence with the demonstrative THAT, however, is 
infelicitous. 


(28) JOHN BUY IX_a MAGAZINE_a, IX_b BOOK_b. #THAT_b BOOK_b EXPENSIVE. 
‘John bought a magazine and a book. The book was expensive.’ 


Even when THAT is signed aligned with the locus associated with the book, the 
demonstrative in this anaphoric situation is unavailable. Another situation where 
demonstratives and definite articles can be distinguished is when referring to a 
contextually salient referent out of the blue. First, I note that demonstratives 
do not require physical pointing to the referent: pointing is neither a 
sufficient nor a necessary condition. 


(29) Context: Policeman, pointing in the direction of a man running through a 
crowd: 
Stop that man! (Roberts 2002: 121) 


¹⁰The sign THAT is also used as a relative pronoun, but other than bearing the same 
phonological realization as the demonstrative, it is unclear that the two usages show any 
syntactic or semantic overlap. 
The example above from Roberts (2002) describes a situation in which a 
policeman is chasing a man through a crowd of several people. It is not obvious from 
the pointing alone whom he is pointing to, but the context makes the referent clear. 
A deictic gesture is also unnecessary for identifying the discourse referent. 
Roberts describes a situation 
in which two friends are sitting in a coffee shop when a man enters and begins 
to noisily harass the employee behind the counter. In this case, without pointing 
and drawing attention to herself, one friend can say to the other: 


(30) That guy is really obnoxious. (Roberts 2002: 121) 


Such an example can be tested in ASL as well. Demonstratives are expected to 
be possible in this environment, but definite articles are predicted to be infelici- 
tous. 


(31) [out of the blue] (#IX-neu) MAN ANNOYING. 
‘That man’s annoying.’ 

(32) [out of the blue] THAT-neu MAN ANNOYING. 
‘That man’s annoying.’ 


Example (31) shows that IX pointing to a neutral location¹¹ cannot be used to 
refer to the contextually salient individual. I show this example with a neutral 
point in order to avoid any confound of assigning an arbitrary locus to an 
individual present in the environment; under normal circumstances, one would use 
a deictic locus in these cases. Even with a neutral point before MAN, the utterance 
is infelicitous. However, the same statement becomes acceptable with THAT or 
even as a bare NP. The use of the bare NP in (31) becomes relevant in the 
discussion on weak definite articles; for the present argument, I am only concerned 
with the contrast between (31) and (32). The situation described here is perfectly 
acceptable with the demonstrative THAT. It is evident that the two signs THAT 
and IX pattern differently, and furthermore, THAT in ASL behaves just like that 
in English. 

The instances of ASL THAT, IX, and English that presented in this section force me to
conclude that IX does not have much in common with English that, and moreover, that it does
not align with the theory of demonstratives adopted here. In contrast, THAT in ASL and that
in English behave alike in the situations presented in this section.


¹¹ I do not make any claims regarding IX in neutral position and its featural
specifications. I am simply pointing out here that IX-neu MAN is prohibited in this case due
to the presence of a salient individual.

Up to this point, I have presented arguments for a strong definite article in ASL. Its
counterpart, the weak definite article, also exists in the language. The next section argues
that bare NPs can play the role of weak article definites.


3.4 Bare NPs as weak article definites 


In the previous two sections, I have provided evidence that the ASL index IX behaves like a
strong definite article as opposed to a demonstrative. Here, I discuss evidence for the
presence of weak article definites in the language.

Recall from the German, Thai, and Akan examples that weak definite articles can appear
across languages in two varieties: overtly or as a bare NP. I have already argued that IX in
ASL is a strong definite article, and by examining bare NPs, I find that they behave like
weak definite articles similar to those in Thai and Akan. (33) and (34) illustrate this.


(33) FRANCE (#IX_a) CAPITAL_a WHAT
'What is the capital of France?' (Koulidobrova & Lillo-Martin 2016: 234)


(34) TODAY SUNDAY. DO-DO? GO CHURCH, SEE (#IX_a) PRIEST_a
'Today is Sunday. What to do? I'll go to church, see the priest.'
(Koulidobrova & Lillo-Martin 2016: 234)


The sentences in (4) and (5) from Koulidobrova & Lillo-Martin (2016) are repeated above in
(33) and (34) respectively. These examples were aimed at indicating the incompatibility of IX
with unique NPs. In (33), IX is impossible even though there is only one capital of the
country. Similarly, in (34), using IX with the NP PRIEST is unacceptable even when there is
a unique priest at the church. The infelicity of these cases is expected if weak article
definites have to be expressed by bare NPs.¹²


4 Reanalyzing IX


Now that I have established IX as a strong article definite when it refers to previously
established loci, and bare NPs as weak definite articles, I can proceed to lay out the
precise nature of definiteness in ASL in relation to IX, loci, and bare NPs. The present
analysis also leads to the question of why bare NPs cannot serve as antecedents to ASL
strong definite articles. I address that question in this section.


¹² In §5, I present examples where uniqueness restrictions on IX are not as strong. These
are cases with two unique referents in the discourse. Such examples warrant further
investigation, but they do not detract from the argument here, which indicates that under
general circumstances, unique referents are unable to be associated with a locus. Moreover,
the reason behind the prohibition of IX in these cases is still not an artifact of IX as a
demonstrative.

The key difference between the weak and strong definite articles manifests 
itself in the presence or absence of an extra individual argument and identity 
relation. This difference is encoded in the definitions of the weak and strong 
definite articles below, as formulated by Schwarz (2009). 


(35) Weak definite article
λs_r.λP : ∃!x(P(x)(s_r)). ιx.P(x)(s_r) (Schwarz 2009: 148)


(36) Strong definite article
a. λs_r.λP.λy : ∃!x(P(x)(s_r) & x = y). ιx[P(x)(s_r) & x = y]
b. [DP 1 [[the s_r] NP]]
c. ⟦(36b)⟧^g = ιx.NP(x)(s_r) & x = g(1) (Schwarz 2009: 260)


In the formulations above, s_r represents the resource situation pronoun in the DP, which is
essentially a variant of a standard indexed variable (Schwarz 2009: 95). The difference
between the two types of articles is that the weak article definite does not contain an
individual argument. The strong definite article, on the other hand, is made up of the weak
definite article, which expresses situational uniqueness, and has a phonologically null
pronominal element, the anaphoric index argument, built into it (Schwarz 2009: 258). I adopt
the above representations of the weak and strong definite articles for bare NPs and IX+NPs,
respectively, as their properties align with the aforementioned distinctions. As per the
discussion, weak article definites do not generally introduce an index, but under my
proposal, I will show that both bare NPs and IX+NPs can introduce indices. The data
presented in this paper do not allow me to make a claim regarding the introduction of
indices for weak article definites more generally, although it is possible that they exhibit
different behaviors when the conditions for the weak article definite are met.


¹³ Some sign languages have been noted to express definiteness via non-manual markers. For
example, a wrinkled nose co-articulated with an NP in Russian Sign Language and in the Sign
Language of the Netherlands signals a known discourse referent (Kimmelman 2015). The use of
non-manual markers to convey definiteness has yet to be observed in ASL. However, future
work would benefit from examining the potential role of non-manual markers or the location
of the referent in signing space. The latter has been noted to play a role in Catalan Sign
Language (Barberà 2014). Thanks are due to an anonymous reviewer for bringing
cross-linguistic work on definiteness and non-manual marking to my attention.

Bare NPs in ASL, moreover, are ambiguous between definites and indefinites. Like bare NPs,
IX+NPs in ASL double as indefinite and definite expressions. These facts lead back to the
question of why indefinite bare NPs cannot serve as antecedents for the strong definite
article. In order to answer this question, I first show in the following sections that both
bare NPs and IX+NPs have a bona fide indefinite reading. Then I discuss the properties of
the strong article definite that require an antecedent which has been introduced through a
locus. Bare NPs cannot serve as antecedents to IX+NPs precisely because they are not
specified at a locus. I propose that bare NPs are underspecified for a locus feature, which
creates a discordance between the two nominal types in the discourse due to the types of
indices they introduce. §4.2 provides evidence and expands on this idea. Support for my
argument that IX is composed of features comes from work showing that features on loci can
be uninterpreted under focus (Kuhn 2015), which I discuss in §4.3. In order to account for
all the patterns I inspect in this paper, I follow Schlenker (2016) in adopting a featural
variable analysis of loci.


4.1 ASL indefinites 


Below, I provide evidence that both bare NPs and IX+NPs also have true indefinite readings.
ASL is a determinerless language, and it has been argued that such languages lack a true
indefinite interpretation (Dayal 2004). Hindi has been shown to fit this description;
however, I illustrate that ASL and Hindi diverge in this respect.¹⁴

Bare NPs in ASL are ambiguous between definites and indefinites. I have already shown
definite readings of ASL bare NPs, and I can apply standard diagnostics to test their
behavior as indefinites. In this section, I examine narrow scope indefinite readings of bare
NPs in subject position to illustrate that bare NPs can have a true indefinite reading.
Moreover, IX+NPs can also have such an interpretation, a fact illustrated through their use
in donkey sentences.

Dayal (2004) argues that Hindi, a language without overt determiners, has bare NPs that lack
a pure indefinite reading. Consider the sentence below:


(37) Hindi (adapted from Dayal 2004: 406)
# Charon taraf baccha khel rahaa thaa.
four ways child play PROG.SG be.SG.PST
'A (different) child was playing everywhere.'


¹⁴ If true, this claim would be in contrast to Dayal (2004), who argues that bare NP
languages without determiners do not have a pure indefinite reading.

Baccha 'child' in (37) above cannot have the interpretation where a different child is
playing in each place; the only reading available is that of a single child. This fact does
not hold in ASL. The following example illustrates that ASL and Hindi must be analyzed
differently, as bare NPs in subject position in ASL can be interpreted with a narrow scope
indefinite reading.


(38) CHILD PLAY EVERYWHERE.
'Same child/a different child was playing everywhere.'


The example in (38) can either have the reading where only one child is playing everywhere,
or the reading where different children are present. If a narrow scope indefinite reading
were impossible, then only the former interpretation would be expected. ASL bare NPs thus
pass this test for indefinite readings. In this respect, (38) is similar to (39) from
English, a language with overt determiners.


(39) A child was playing everywhere.


As the English example illustrates, a narrow scope indefinite reading is possi- 
ble with a child, where both interpretations of a single child or different children 
are available. ASL and English do not appear to differ in this regard, and it seems 
that bare NPs in ASL pattern with English indefinites. 

Another test of a true indefinite is its use in donkey sentences. It is known 
from decades of research on the topic (Geach 1962; Lewis 2002[1975], i.a.) that 
indefinites allow for donkey anaphora. English indefinites show this property. 


(40) Every time I meet a student, me and him get into a fight. 


In (40), the encounters can involve a different student each time, which is expected for
true indefinites. The facts for IX+NPs in ASL are the same as in English, again indicating
that they are ambiguous between definites and indefinites. In the example below, a locus for
STUDENT has been set up, and the pronominal forms in the utterance refer both to the space
of the signer and to the locus for STUDENT.


(41) EVERY-TIME I MEET IX_a STUDENT_a ME-IX_a FIGHT.
'Every time I meet a student, me and him get into a fight.'


Like the English example, the sentence in (41) can also refer to different encounters with
students, which illustrates that donkey readings are possible with IX+NPs. Given the facts
of bare NPs and IX+NPs in this section, I conclude that both bare NPs and IX+NPs have a true
indefinite reading. I can now build on this fact and encapsulate it within my proposal.


4.2 The basic proposal


In this section, I follow the file card semantics of Heim (2002[1983]) to capture the
patterns in the language observed earlier. Under this theory, information within an
utterance can be metaphorically viewed as being stored in files. Each logical form of a
sentence is also assigned a file change potential, which is a function from the file that
obtains prior to an utterance to the file obtained after the utterance. The truth of the
file is determined by the sequences of individuals that satisfy it. Such a sequence is a
function from a subset of the natural numbers N into the domain of all individuals. For
instance, for the pair of members a_1 and a_2, ⟨a_1, a_2⟩ is the function which maps 1 to
a_1 and 2 to a_2 (Heim 2002[1983]: 228).
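
To make the file metaphor concrete, the following is a minimal Python sketch of a file as a
pair of a domain and a satisfaction set. It is an illustration under stated assumptions and
not part of Heim's formalism: the toy model (INDIVIDUALS, EXT) and the function name update
are invented here, and sequences are encoded as sorted tuples of (index, individual) pairs
rather than as functions.

```python
from itertools import product

# Hypothetical toy model (assumed for illustration only).
INDIVIDUALS = ["john", "book1", "book2"]
EXT = {
    "John": {("john",)},
    "book": {("book1",), ("book2",)},
    "bought": {("john", "book1")},
}

def update(dom, sat, pred, indices):
    """Sketch of a file change potential for the atomic formula pred(indices).

    dom: tuple of indices already in the file.
    sat: set of sequences, each a sorted tuple of (index, individual) pairs.
    Returns the domain and satisfaction set of the updated file.
    """
    # Indices not yet in the domain are novel and extend the sequences.
    new = [i for i in dict.fromkeys(indices) if i not in dom]
    new_dom = dom + tuple(new)
    new_sat = set()
    for seq in sat:
        for values in product(INDIVIDUALS, repeat=len(new)):
            extended = dict(seq)
            extended.update(zip(new, values))
            # Keep the extended sequence only if it satisfies the formula.
            if tuple(extended[i] for i in indices) in EXT[pred]:
                new_sat.add(tuple(sorted(extended.items())))
    return new_dom, new_sat

# The empty file: no indices, satisfied by the empty sequence.
dom, sat = (), {()}
dom, sat = update(dom, sat, "John", (1,))
dom, sat = update(dom, sat, "book", (2,))
dom, sat = update(dom, sat, "bought", (1, 2))
```

In this sketch, each update either extends the sequences with novel indices or filters the
existing sequences, which is the sense in which a file change potential maps one file to the
next.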

Definites and indefinites in natural language, under this system, can be un- 
derstood through the Novelty/Familiarity Condition, as given in (42), where def- 
inites are familiar referents and indefinites are novel. 


(42) The Novelty/Familiarity Condition
“Let F be a file, p an atomic proposition. Then p is appropriate with respect
to F only if, for every noun phrase NP_i with index i that p contains:
If NP_i is definite, then i ∈ Dom(F), and
If NP_i is indefinite, then i ∉ Dom(F).” (Heim 2002[1983]: 233)


The Novelty/Familiarity Condition simply states that definites are familiar referents whose
index is already in the domain of the file F, whereas indefinites are novel referents whose
index is not in the domain of the file. Taking this basic notion of definites and
indefinites into account, I can now proceed to analyze the ASL patterns discussed
throughout. The basic proposal is this: IX introduces a locus, which can be viewed as the
introduction of a locus feature on the NP that follows. Bare NPs lack such a feature, as
they are not signed at a locus, i.e., a particular point in signing space. Only bare NPs can
refer back to bare NPs, while only NPs specified for a locus feature can refer back to loci,
because bare NPs are unspecified for this feature. In essence, the specification of a locus
feature means that bare NPs and IX+NPs introduce different types of indices: one specified
for a locus feature and the other underspecified for it. These distinct indices force an
IX+NP to be interpreted as a new referent even if there is a bare NP that could potentially
serve as an antecedent.


¹⁵ The data could potentially be accounted for by proposing that bare NPs do not introduce
an index at all, although then one would have to propose an additional mechanism by which
bare NPs can refer to each other as in (43). More data along these lines may make it
possible to distinguish between the two alternatives.

Let me illustrate this idea with some examples:


(43) a) JOHN BOUGHT BOOK. b) BOOK INTERESTING.
'John bought a book. The book was interesting.'


(44) a) IX_a JOHN_a BOUGHT IX_b BOOK_b. b) IX_b BOOK_b INTERESTING.
'John bought a book. The book was interesting.'


(45) a) JOHN BOUGHT BOOK. b) #IX_a BOOK_a INTERESTING.
'John bought a book_i. #A book_i was interesting.'


I take each of the above examples in turn and explain how they are interpreted in accordance
with my analysis. In (43), neither of the bare NPs BOOK is specified for a locus feature.
Therefore, the second instance of BOOK does not introduce an indefinite, and it is
interpreted as familiar. In (44), the first instance of BOOK with a locus feature introduces
an indefinite index. The second instance of BOOK, however, is signed at the same locus,
referring back to the same index. BOOK in (44b) is therefore necessarily interpreted as
familiar. Finally, the example in (45) is key to understanding the proposed analysis. BOOK
in (45b) is specified for a locus feature, while the bare NP BOOK in (45a) is not. In that
case, the second instance of BOOK is interpreted as an indefinite, and the sentence is
infelicitous under the reading that the same book is under discussion.¹⁷
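
The pattern in (43)–(45) can be sketched as a small Python procedure. This is a deliberately
simplified illustration, not the analysis itself: the class NP, the function interpret, and
the decision to key an index on the pair of noun and locus are assumptions made only for
this sketch (in particular, it ignores the fact that a second bare NP could also be read as
a new indefinite).

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class NP:
    noun: str
    # None models a bare NP, i.e. an index underspecified for a locus feature.
    locus: Optional[str] = None

def interpret(discourse):
    """Label each NP as 'novel' or 'familiar'.

    Simplifying assumption: an NP is familiar only if an earlier NP with the
    same noun AND the same locus specification is already in the file.
    """
    file_domain = set()
    labels = []
    for np in discourse:
        index = (np.noun, np.locus)  # index type depends on locus specification
        if index in file_domain:
            labels.append("familiar")
        else:
            file_domain.add(index)
            labels.append("novel")
    return labels

ex43 = [NP("JOHN"), NP("BOOK"), NP("BOOK")]
ex44 = [NP("JOHN", "a"), NP("BOOK", "b"), NP("BOOK", "b")]
ex45 = [NP("JOHN"), NP("BOOK"), NP("BOOK", "a")]

for name, ex in [("(43)", ex43), ("(44)", ex44), ("(45)", ex45)]:
    print(name, interpret(ex))
```

Under these assumptions, the second BOOK comes out as familiar in (43) and (44) but as novel
in (45), because a bare antecedent and a locus-specified NP never share an index.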

Earlier in the paper, I showed that bare NPs and IX+NPs are ambiguous between definite and
indefinite readings. Therefore, as per the Novelty/Familiarity Condition, both bare NPs and
IX+NPs can either introduce an indefinite or refer to a familiar expression. For both bare
NPs and IX+NPs, this rule is summarized in (46), given a file F, its domain Dom(F), the set
Sat(F) of sequences that satisfy F, and an index i:


(46) If i ∈ Dom(F), then Sat(F′) = Sat(F + b_i ∈ Ext(“NP”));
else, if i ∉ Dom(F), then Dom(F′) = Dom(F) ∪ {i}.


¹⁶ I leave out the loci for JOHN in (43) and (45) for expository purposes. This does not
affect the readings of the sentences in any relevant way.

¹⁷ The sentence is perfectly acceptable with the reading that there is a novel book that is
interesting, i.e., when the two books do not corefer.


(i) JOHN BUY BOOK. IX_a BOOK_a INTERESTING.
'John bought a book_i. A book_j is interesting.'


The extent to which the above sentence is infelicitous in ASL may be compared to the
English translation provided.

The analysis I have proposed here follows from the building blocks of Heim's system: every
NP in logical form carries an index, and the only distinction between the two types of
nominal expressions in ASL is their association with a locus. Let me now show how the
mechanisms of this analysis emerge under the workings of file card semantics. There are two
basic requirements for indefinite expressions, as stated in (47): i) the index must not be
in the domain of the file, Dom(F), and ii) the satisfaction set of the file updated with an
atomic formula p, Sat(F+p), must not be empty.


(47) i ∉ Dom(F) & Sat(F+p) ≠ ∅


In ASL, when an IX+NP is introduced, a new file card is obtained if the index is not in
Dom(F).

When introducing an indefinite, the sequences in Sat(F+p) have to be longer than those in
Sat(F). With these principles in place, I can work through the examples in (43)–(45). Below,
I provide the interpretation for (43).


(48) Sat(F_0 + (43a)) = Sat((F_0 + [NP_1 John]) + [NP_2 a book] + [e_1 bought e_2])
= {⟨b_1, b_2⟩ : b_1 ∈ Ext(“John”), b_2 ∈ Ext(“book”) and ⟨b_1, b_2⟩ ∈ Ext(“bought”)}


Here, I have thus far simply introduced extensions of sequences that were not 
in Dom(F), but whose sub-sequences satisfy F and p, by allowing for cases where 
F+p has a larger domain than F. I have not yet had to deal with cases with a 
familiar referent. Example (43b) is such a case, and I account for it as shown in 
(49): 


(49) Dom(F_1) = {1, 2}
Sat(F_2) = {⟨b_1, b_2⟩ : ⟨b_1, b_2⟩ ∈ Sat(F_1) and b_2 ∈ Ext(“interesting”)}


We already have the two file cards for 1 and 2 at this point. When (43b) is uttered, the
file cards are updated accordingly. No new index is introduced, as both instances of BOOK in
this case are bare NPs unspecified for a locus feature, and BOOK in (43b) is understood as a
familiar referent. Both instances of BOOK are associated with the same index; thus, (43) can
be summarized as (50):


(50) John(x) & book(y) & bought(x,y) & interesting(y) 


The examples in (44) are interpreted in the same way as (43), even though both instances of
BOOK here are specified for a locus feature. The interpretation of (44a) is shown in (51):


(51) Sat(F_0 + (44a))
= Sat((F_0 + [NP_1 John]) + [NP_2 a book] + [e_1 bought e_2])
= {⟨b_1, b_2⟩ : b_1 ∈ Ext(“John”), b_2 ∈ Ext(“book”) and ⟨b_1, b_2⟩ ∈ Ext(“bought”)}


As seen above, the interpretation for (44a) is not different from (43a). Similarly, 
a novel index is not introduced when the second instance of BOOK is uttered in 
(44b), as it is also specified for the same locus feature. 


(52) Dom(F_1) = {1, 2}
Sat(F_2) = {⟨b_1, b_2⟩ : ⟨b_1, b_2⟩ ∈ Sat(F_1) and b_2 ∈ Ext(“interesting”)}


In sum, for (44) we also get:


(53) John(x) & book(y) & bought(x,y) & interesting(y)


The interpretations of (43) and (44) do not differ: the second instance of BOOK is familiar
in both cases, since the two BOOK NPs are either both bare NPs or both IX+NPs. A different
result is obtained when the first NP for BOOK is a bare NP and the second NP has a locus
feature.

For (45), part (a), which contains only novel expressions, receives the same interpretation
as (43a) and (44a), as no decision about the familiarity or novelty of the referent has to
be made.


(54) Sat(F_0 + (45a))
= Sat((F_0 + [NP_1 John]) + [NP_2 a book] + [e_1 bought e_2])
= {⟨b_1, b_2⟩ : b_1 ∈ Ext(“John”), b_2 ∈ Ext(“book”) and ⟨b_1, b_2⟩ ∈ Ext(“bought”)}


(45b), however, is different. The first instance of BOOK in this case was a bare NP, one not
specified for a locus feature. On the other hand, BOOK in (45b) is specified for a locus
feature. Since the index for the bare NP BOOK was underspecified for a locus feature, it
cannot be the same one as that of the IX+NP BOOK, and hence, a distinct index for the second
instance is introduced.


(55) Dom(F_2) = {1, 2, 3}
Sat(F_1 + (45b))
= Sat((F_1 + [NP_3 a book]) + [e_3 interesting])
= {⟨b_1, b_2, b_3⟩ : b_3 ∈ Ext(“book”) and b_3 ∈ Ext(“interesting”)}


Thus, for (45), the interpretation in (56) is obtained, which is unlike (43) and 
(44): 


(56) John(x) & book(y) & bought(x,y) & book(z) & interesting(z) 


It can be seen above that the second instance of BOOK is interpreted as an indefinite, which
renders the pair of sentences infelicitous under the reading where the two books refer to
the same entity. BOOK in (45b) cannot refer to the one in (45a), as the latter is
unspecified for a locus feature.
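
The difference between the summary formulas in (50) and (56) can be checked against a small
model. The following Python sketch assumes an invented model in which John bought one book
and a different book is interesting; the domain D and the extensions are hypothetical and
serve only to show that (56), unlike (50), does not require the two books to be the same
individual.

```python
from itertools import product

# Hypothetical model: John bought book1, but only book2 is interesting.
D = ["john", "book1", "book2"]
John = {"john"}
book = {"book1", "book2"}
bought = {("john", "book1")}
interesting = {"book2"}

def sat_50():
    """Assignments satisfying John(x) & book(y) & bought(x,y) & interesting(y)."""
    return [(x, y) for x, y in product(D, repeat=2)
            if x in John and y in book and (x, y) in bought and y in interesting]

def sat_56():
    """Assignments satisfying John(x) & book(y) & bought(x,y) & book(z) & interesting(z)."""
    return [(x, y, z) for x, y, z in product(D, repeat=3)
            if x in John and y in book and (x, y) in bought
            and z in book and z in interesting]

print(sat_50())
print(sat_56())
```

In this model no assignment satisfies (50), while (56) is satisfied by an assignment with
two distinct books, which mirrors the non-coreferent reading described for (45).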

Now that I have shown how the analysis plays out, I need to explicate the relationship
between loci, bare NPs, and indices. I have already stated that both IX+NPs and bare NPs
introduce indices, but what kind of index does each introduce? From the analysis laid out so
far, I propose that bare NPs are underspecified for a locus feature, given that the language
allows a locus feature to be associated with NPs. This locus feature is specified according
to the index the NP takes. The following section elaborates on this final point, but for now
I can formalize the two types of indices as those underspecified for a locus feature and
those specified for it. Bare NPs take the former kind, which can be denoted using Greek
letters α, β, etc. IX+NPs take indices of the type a, b, c, etc., the kind which is
specified for a locus feature. Thus, for the sentences in (43)–(45), a particular kind of
index is obtained depending on whether the NP is associated with a locus or is a bare NP.
With this updated proposal, let me revisit the example in (43) and illustrate its
representation under this system. The interpretation for (43a) is provided in (57):


(57) Sat(F_0 + (43a))
= Sat(F_0 + [NP_α John] + [NP_β a book] + [e_α bought e_β])
= {⟨b_α, b_β⟩ : b_α ∈ Ext(“John”), b_β ∈ Ext(“book”) and ⟨b_α, b_β⟩ ∈ Ext(“bought”)}


Notice that in (57) the numerical indices are now replaced by α and β to illustrate the
underspecification of the locus feature. The type of indices we are dealing with is now
transparent. Since (43b) also makes use of a bare NP, no new file card is introduced and the
NP is interpreted as familiar, as is shown in (58).


(58) Dom(F_1) = {α, β}
Sat(F_2) = {⟨b_α, b_β⟩ : ⟨b_α, b_β⟩ ∈ Sat(F_1) and b_β ∈ Ext(“interesting”)}


Thus, for (43) we get (59):


(59) John(x) & book(y) & bought(x,y) & interesting(y)


¹⁸ The underspecification of indices for a feature is not unique to ASL. Persian
pseudo-incorporated nominals are argued to display a similar property (Krifka & Modarresi
2016), where the discourse referents introduced by these NPs are underspecified for number.
Covert pronouns are also said to lack number features, while overt ones are marked for
number. Krifka & Modarresi show that overt pronouns require number-marked NPs, whereas
covert pronouns do not. This analysis is parallel to what I propose here for ASL NPs with a
locus feature.

Now that I have presented bare NPs as introducing indices of the type α and β, I can account
for (44) in a similar manner by invoking indices of the type a and b, which are specified
for a locus feature. The interpretation for (44a) is provided in (60).


(60) Sat(F_0 + (44a))
= Sat(F_0 + [NP_a John] + [NP_b a book] + [e_a bought e_b])
= {⟨b_a, b_b⟩ : b_a ∈ Ext(“John”), b_b ∈ Ext(“book”) and ⟨b_a, b_b⟩ ∈ Ext(“bought”)}


Example (44) is understood in the same way as example (43), except with the use of NPs that
are associated with a locus. BOOK in (44b) is also interpreted as a definite expression.


(61) Dom(F_1) = {a, b}
Sat(F_2) = {⟨b_a, b_b⟩ : ⟨b_a, b_b⟩ ∈ Sat(F_1) and b_b ∈ Ext(“interesting”)}


In sum, for (44) we get (62):


(62) John(x) & book(y) & bought(x,y) & interesting(y)


An interaction between the two systems now becomes apparent in (45), which ultimately does
not yield a coreferent interpretation. The bare NPs in (45a) introduce indices
underspecified for a locus feature, but the IX+NP in (45b) introduces an index with a locus
feature. First, the interpretation of (45a), which contains only novel expressions, simply
introduces indefinites as in (43a).


(63) Sat(F_0 + (45a))
= Sat(F_0 + [NP_α John] + [NP_β a book] + [e_α bought e_β])
= {⟨b_α, b_β⟩ : b_α ∈ Ext(“John”), b_β ∈ Ext(“book”) and ⟨b_α, b_β⟩ ∈ Ext(“bought”)}


(45b), in contrast, is different. Here, a familiar reading of BOOK is not obtained, as this
NP is associated with a locus. It introduces an index a, which is not of the type
underspecified for a locus feature. Thus, it introduces a new file card, and the second
instance of BOOK is understood as an indefinite expression.


(64) Dom(F_2) = {α, β, a}
Sat(F_1 + (45b))
= Sat((F_1 + [NP_a a book]) + [e_a interesting])
= {⟨b_α, b_β, b_a⟩ : b_a ∈ Ext(“book”) and b_a ∈ Ext(“interesting”)}


As a result, the interpretation for (45) is the following:


(65) John(x) & book(y) & bought(x,y) & book(z) & interesting(z)


The analysis presented above illustrates two main points: first, NPs in ASL can be either
specified or underspecified for a locus feature; and second, an NP specified for a locus
feature cannot refer to an NP that is underspecified for it. Given this system, the
infelicity of a definite reading with IX can now be predicted in expressions like (45b).

Finally, my proposal allows me to explain some examples presented in the literature
regarding IX without an NP. Koulidobrova & Lillo-Martin (2016) also argue that IX without an
NP is not a pronoun, against previous claims in the literature (Kuhn 2015). The present
proposal makes it possible to decide between the two sides of the debate, as I can lay out
the arguments against IX as a pronoun and show that they do not hold under the current
analysis. I have already established that IX+NPs and bare NPs introduce two flavors of
indices that do not interact with each other. An IX+NP will be interpreted as an indefinite
expression unless it has an IX+NP antecedent with the same specified locus feature. The
argument against IX as a pronoun is based on evidence like the following:


(66) PETER THINK IX_a / IX-neu SMART.
'Peter_i thinks he_*i/j is smart.' (Koulidobrova & Lillo-Martin 2016: 241)


(67) a. WHEN ONE ₐCL STUDENT COME PARTY, ₐIX HAVE-FUN.
'When a student_i comes to the party, he_i has fun.'


b. WHEN ONE STUDENT COME PARTY, ₐIX/neu-[CL IX] HAVE-FUN.
'When a student_i comes to the party, he_*i has fun.' (Schlenker 2010: 18,
as cited by Koulidobrova & Lillo-Martin 2016: 242)


The line of reasoning here is that IX cannot refer back to the bare NP, as in (66), which
would be odd given the putative pronominal nature of IX. The mystery dissolves under the
present approach, wherein the bare NP and IX+NP introduce indices of different types. The
example in (66) shows that the first instance of IX cannot refer back to Peter, but only to
another individual, which is completely predictable if it is assumed that IX, like IX+NPs,
cannot refer back to bare NPs because it is specified for a locus feature.

The system of NPs being specified or unspecified for a locus feature allows us to view the
function of loci differently. They are not merely the realization of indices in the
language; they also make it possible to keep track of discourse referents. Specifying an NP
for a locus feature is, then, simply more efficient than using bare NPs. Certainly, I do not
wish to make a strong functional claim here in which ease of processing drives the use of
loci. I am only stating that a signed language has the option of using loci, and ASL makes
use of this option.

Throughout this section, I have assumed that loci are features, an analysis that has been
proposed previously for ASL (Kuhn 2015; Schlenker 2016). Since this assumption is
non-trivial, I discuss it in further detail in the following section.


4.3 Loci as featural variables 


The notion that IX consists of a locus feature and that bare NPs are underspecified for it
integrates previous proposals, namely that of featural variables (Schlenker 2016). A
featural variable analysis of loci accounts for the ability of loci to be reused and shared,
and for features to be uninterpreted under only, a fact that has been noted for the language
(Kuhn 2015). Below, I discuss the arguments for a featural variable analysis, and then show
how my analysis fits in with this approach to ASL.


4.3.1 Arguments for loci as features 


The motivation for a featural variable approach consists of two parts: arguments 
for loci as morpho-syntactic features and arguments for loci as variables. I discuss 
both aspects of the analysis so that I can examine how this proposal relates to 
the other facts of the language. I start with arguments for loci as features in this 
section. 

There are several crucial facts that illustrate the need for ASL loci to be analyzed in part
as morpho-syntactic features. Loci can be reused and shared, and the features of the NP
associated with the locus can be uninterpreted under only. I illustrate each of these facts
in turn below.

Prima facie, loci can be reused since loci do not remain associated with a particular entity
for longer than a conversation. Moreover, loci can be reused even within the same
conversation.


(68) KINDERGARTEN CLASS STUDENTS IX-arc_ab STUDENTS PRACTICE DIFFERENT
COMPLIMENTS. FIRST, IX_a ALAN_a TELL IX_b BILL_b IX_a ADMIRES IX_b. SECOND,
IX_a CHARLES_a TELL IX_b DANIELLE_b IX_a LIKES POSS_b STYLE. THIRD, IX_a EVE_a
TELL IX_b FRANCIS_b IX_a THINK IX_b HANDSOME.
'In a kindergarten class, the students were practicing different compliments.
First, Alan_i told Bill_j that he_i admires him_j. Second, Charles told
Danielle that he likes her style. Third, Eve told Francis that she thinks he's
handsome.' (adapted from Kuhn 2015: 462)


Example (68) demonstrates how the loci a and b can be reused for every pair referenced in
the sentences. Therefore, there is no one-to-one correspondence between loci and discourse
referents throughout a single discourse. Under this approach, the introduction of a distinct
NP, even with the same locus feature associated with it, introduces a new index, and thus
the loci get reused.

The argument that there is no one-to-one correspondence between loci and 
variables is, furthermore, bolstered by the fact that loci can be shared. This is 
illustrated below: 


(69) EVERY-DAY, IX_a JOHN_a TELL IX_b MARY_b IX_a LOVE IX_b. BILL_a NEVER TELL SUZY_b
IX_a LOVE IX_b.
'Every day, John_i tells Mary_j that he_i loves her_j. Bill_k never tells Suzy_l that
he_k loves her_l.'


Example (69) shows that two referents can be situated at one locus - therefore, 
it appears that loci can be shared. This property further undermines the strong 
one-to-one correspondence between loci and variables. 

Another argument that shows the need to invoke features on loci arises from the
uninterpreted phi-features on pronouns under focus-sensitive operators like only. Let me
first consider the following English sentences:


(70) a. Only Mary did her homework. 
b. Only I did my homework. 


Example (70a) entails that John did not do his homework even though he is 
male, and example (70b) entails that John did not do his homework even though 
he is not the speaker. Thus, in English both gender and person features can be 
uninterpreted under only. These facts are paralleled by the ASL loci examples as 
well: 


(71) IX_a JESSICA_a TELL-ME IX_b [BILLY ONLY-ONE]_b FINISH POSS_b HOMEWORK.
Bound reading: Jessica told me [only Billy] λz.z did z’s homework. (Kuhn
2015: 9)


If there were a one-to-one relationship between a locus and the index
associated with it, it would be unexpected that a feature of the pronoun can be
deleted such that the pronoun can range over persons not associated with that
locus. In other words, with BILLY at locus b, it should be impossible to
consider JESSICA, signed at locus a, as a value for the index associated with
locus b. The fact that the pronoun signed at locus b can range over entities not
associated with that locus indicates that some features at the locus can be
uninterpreted. In this case, the locus feature is uninterpreted and reference
can be made to both BILLY and JESSICA.

In this section, I have presented arguments to abandon the view that there is 
an absolute one-to-one correspondence between loci and variables. I have also 
shown that the ASL data presented here are compatible with an analysis that
treats loci as features. The following section presents an overview of the
argument that variables are not obsolete in analyzing loci.


4.3.2 Arguments for loci as variables 


The evidence for loci being composed of features is convincing, but there are 
also reasons for which I would not want to opt for a completely variable-free 
analysis. In addition to the fact that loci generally refer to the individual they 
are associated with, as seen in §2, Schlenker (2016) argues for another reason
to retain variables: iconic bound loci, which refer to an individual's importance, 
height, or position. Loci in such instances can be set up high or low to indicate 
the aforementioned aspects, which makes them iconic. It appears that in these 
cases not all features under only get deleted and the iconic height feature on the 
locus remains intact. 

Iconic bound loci in ASL can be easily captured in a variable account of loci, 
but the account for iconic bound loci under a variable-free analysis is not straight- 
forward. The examples below illustrate that in ASL, high loci can be used to refer 
to tall, powerful, or important individuals, and the height of the loci is still inter- 
preted under binding and under only (Schlenker 2016). 


(72) GYMNAST COMPETITION MUST STAND BAR FINISH STAND HANG.
‘In a gymnastics competition one must stand on a bar and then go from
standing to hanging position.’
a. ALL GYMNAST IX_a-neutral WANT IX-1 LOOK_a-high FINISH FILM_a-low.
‘All the gymnasts want me to look at them while they are up before
filming them while they are down.’
b. ONLY-CL GYMNAST IX_a-neutral WANT IX-1 LOOK_a-high FINISH FILM_a-low.
‘Only one of the gymnasts wants me to watch her while standing before
filming her while hanging.’ (Schlenker 2016: 1081)


Example (72) shows that although phi-features under only can be uninterpreted,
the height feature must keep its positional association intact. Iconic bound
loci therefore lend support to an analysis of loci that also makes use of
variables. Combining both aspects of loci, Schlenker (2016) proposes a featural
variable analysis, which I expand on in the next section.


4.3.3 Featural variables 


The facts noted earlier in the paper show the need for an approach to loci that
accounts for them as both features and variables. A featural variable analysis
(Schlenker 2016) provides a platform to do exactly that. Below, I discuss how the
cases of locus reuse, locus sharing, and interpretation under only are accounted
for under Schlenker's analysis.¹⁹

Let me first lay out the tools needed to address the observed patterns. I showed 
that features can be deleted under focus operators; therefore, a deletion rule is 
needed. Below are rules that result under a semantic or a morpho-syntactic ap- 
proach. The following rule under a semantic analysis allows a feature F on a 
pronoun to remain uninterpreted under focus. For expository purposes, I discuss 
Schlenker's illustration of the deletion of a potential feminine feature. 


(73) “Let E be an expression of type e and f a feminine feature, F a focus marker,
and $[\![\alpha]\!]^{o,c,s,w}$ and $[\![\alpha]\!]^{f,c,s,w}$ the ordinary and focus values of α under a
context c, an assignment function s and a world w.


a. $[\![E^f]\!]^{o,c,s,w} = \#$ iff $[\![E]\!]^{o,c,s,w} = \#$ or $[\![E]\!]^{o,c,s,w}$ is not female in the world
of c. If $[\![E^f]\!]^{o,c,s,w} \neq \#$, $[\![E^f]\!]^{o,c,s,w} = [\![E]\!]^{o,c,s,w}$.

b. $[\![E^f]\!]^{f,c,s,w} = [\![E]\!]^{f,c,s,w}$ (i.e. the feature f plays no role in the focus
dimension.)

c. $[\![E_F]\!]^{f,c,s,w} = [\![E^f_F]\!]^{f,c,s,w} = E$, the set of individuals” (Schlenker 2016:
1070)


¹⁹See Schlenker (2016) for a complete account of how a featural variable system can incorporate
the various properties of loci.

The above rule states that an expression with a feminine feature f results in
a presupposition failure if and only if the expression itself results in a
presupposition failure or its denotation is not female in the world of the
context c. Otherwise, the feature leaves the ordinary value unchanged, and it
plays no role in the focus dimension. An alternative to feature deletion under
focus is the deletion-under-agreement rule, which belongs to a morpho-syntactic
approach. The rule below optionally allows a feature F to remain uninterpreted
if a pronoun is bound by an element with feature F, i.e. when the features
agree.


(74) a. “Optionally delete feature F of a variable $v^F$ if (i) $v^F$ appears next to a
λ-abstractor $\lambda v^F$ and the appearance of $\lambda v^F$ is triggered by an expression
with feature F, or (ii) $v^F$ is bound by $\lambda v^F$.

b. λ-abstractors inherit the features of the expressions that trigger their
appearance.” (Schlenker 2016: 1071)


As opposed to the rule in (73), (74a) provides us with a deletion-under-agreement
approach. (74a) states that a feature on a variable may be deleted when
the variable appears next to a λ-abstractor whose occurrence is triggered by an
expression with that feature, or when the variable is bound by that
λ-abstractor. The rules above make it possible to account for cases where the
features of an entity associated with a locus are uninterpreted.
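
The division of labor in (73) between the ordinary and the focus dimension can be made concrete with a minimal sketch in Python. It is an illustration under stated assumptions, not part of Schlenker's formalism: the toy domain and the set of female individuals are invented, presupposition failure (#) is modeled as None, and the three function names are hypothetical.

```python
from typing import FrozenSet, Optional

# Assumed toy world (not from the source): three individuals, two of them female.
DOMAIN: FrozenSet[str] = frozenset({"Mary", "Sue", "John"})
FEMALE: FrozenSet[str] = frozenset({"Mary", "Sue"})


def ordinary_value_with_feminine(value: Optional[str]) -> Optional[str]:
    """Clause (73a): E^f fails (None) iff E fails or E's value is not female.

    When there is no failure, the feature leaves the ordinary value unchanged.
    """
    if value is None or value not in FEMALE:
        return None
    return value


def focus_value_with_feminine(focus_value: FrozenSet[str]) -> FrozenSet[str]:
    """Clause (73b): the feminine feature plays no role in the focus dimension."""
    return focus_value


def focus_value_of_focused() -> FrozenSet[str]:
    """Clause (73c): a focus-marked expression, with or without the feature,
    has the whole set of individuals as its focus value."""
    return DOMAIN
```

In this toy model a feminine-marked pronoun denoting John fails on its ordinary value, yet John remains among the focus alternatives. This is the configuration that lets (70a) exclude John even though he is male.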

Although these rules can straightforwardly account for deleted or
uninterpreted features under focus operators, there is another option available
for locus sharing cases. The relevant example in (69), originally discussed by
Kuhn (2015), is repeated below as (75). Here, JOHN and MARY share locus a, and
BILL and SUZY share locus b.


(75) EVERY-DAY, IX_a JOHN_a TELL IX_a MARY_a IX_a LOVE IX_a. IX_b BILL_b NEVER TELL
IX_b SUZY_b IX_b LOVE IX_b.
‘Every day, John_i tells Mary_j that he_i loves her_j. Bill_k never tells Suzy_l that
he_k loves her_l.’ (Schlenker 2016: 1073)


The pattern noted above can be captured via deletion under agreement (74a).
On a deletion analysis, one can simply say that the locus feature a gets
deleted under agreement, as shown below.


(76) John^a λi^a Mary^a λk^a t_i^a tell t_k^a [pro_i^a love pro_k^a] (Schlenker 2016: 1079)


However, it does seem a bit odd that one would be able to refer back to a
locus after its features have been deleted.²⁰ Schlenker also proposes an
alternative on which, in the example above, John and Mary form a plurality
of individuals, and IX refers only to a part of this plurality. Given
that the contribution of loci is sensitive to the assignment function s, one
can require that an expression E associated with a locus a denote a part of
what a denotes. A general part-denoting rule for loci can thus be spelled out
as follows:


(77) “For every locus a ≠ 1, 2, if E is an expression of type e, $[\![E^a]\!]^{c,s,w} = \#$ iff
$[\![E]\!]^{c,s,w} = \#$ or $[\![E]\!]^{c,s,w}$ isn't a mereological part of s(a) or $[\![E]\!]^{c,s,w}$ is
present in the situation of utterance in c and 1, $[\![E]\!]^{c,s,w}$ and a are not
roughly aligned. If $[\![E^a]\!]^{c,s,w} \neq \#$, $[\![E^a]\!]^{c,s,w} = [\![E]\!]^{c,s,w}$.” (Schlenker 2016:
1080)


This rule proposes that the locus denotes the plurality John ⊕ Mary, and that
one refers back to a part of that plurality. The denotation of the expression E
has to be a mereological part of the value that the assignment function assigns
to the locus. Hence, there are now two options for dealing with the locus
sharing examples: via deletion under agreement (74a) or via denotation of parts
(77).
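
The part-denoting option in (77) can be sketched as follows. This is a simplified illustration in Python, not Schlenker's definition: pluralities are modeled as non-empty sets of atoms, mereological parthood as the subset relation, and presupposition failure (#) as None. The alignment clause about referents present in the situation of utterance is deliberately left out, and the names `locus_marked_value` and `toy_assignment` are hypothetical.

```python
from typing import Dict, FrozenSet, Optional

# A plurality is modeled as a non-empty set of atomic individuals (assumption).
Plurality = FrozenSet[str]

# Assumed toy assignment function s for (75): each locus denotes a plurality.
toy_assignment: Dict[str, Plurality] = {
    "a": frozenset({"John", "Mary"}),
    "b": frozenset({"Bill", "Suzy"}),
}


def locus_marked_value(
    value: Optional[Plurality],
    locus: str,
    assignment: Dict[str, Plurality],
) -> Optional[Plurality]:
    """Simplified version of (77): E^a fails iff E fails or the value of E
    is not a mereological part of s(a); otherwise E^a denotes what E denotes.
    """
    if value is None or not value:
        return None  # E itself fails (or denotes nothing)
    whole = assignment.get(locus)
    if whole is None or not value <= whole:
        return None  # not a part of what the locus denotes
    return value
```

With this assignment, an expression denoting John or Mary is defined at locus a, while an expression denoting Bill fails there, which mirrors the way both JOHN and MARY can be retrieved from a shared locus.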

Schlenker's rules make it possible to capture the properties of loci observed
by Kuhn. The deletion rule can be invoked for the breakdown of the one-to-one
correspondence under a focus operator like only. Moreover, the rule stated in
(46) must be modified in order to account for the locus sharing instances.
First, I note, as Kuhn did, that examples like the one in (75) are heavily
dependent on the right context. They become possible when the discourse
facilitates their use through parallelism between the two sentences or a
similar mechanism, but they are not ordinarily judged as unexceptional. Taking
that into consideration, the rule stated in (46), repeated in (78), can now be
modified accordingly.


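(78) If $i \in \mathrm{Dom}(F)$, then $\mathrm{Sat}(F') = \mathrm{Sat}(F + b_i \in \mathrm{Ext}(\text{“NP”}))$;
else, if $i \notin \mathrm{Dom}(F)$, then $\mathrm{Dom}(F') = \mathrm{Dom}(F) \cup \{i\}$.

The locus sharing cases now require adding the following condition:


(79) If $i \in \mathrm{Dom}(F)$, and $b_i \in \mathrm{Ext}(\text{“NP”})$ is consistent with the context, then
$\mathrm{Sat}(F') = \mathrm{Sat}(F + b_i \in \mathrm{Ext}(\text{“NP”}))$;
else, if $i \notin \mathrm{Dom}(F)$, then $\mathrm{Dom}(F') = \mathrm{Dom}(F) \cup \{i\}$.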


²⁰Schlenker (2016) does not provide any further details on how a deletion analysis captures cases
like (75). Without this supplementary information, the merits of appealing to feature deletion
here are yet to be seen.

By adding the consistency-with-the-context requirement in (79), more than one
NP can now be associated with the same locus. When a second NP is signed
at the same locus as a previous NP, it is considered a novel referent once context
has determined that the second NP is not equal to the first. In other words, when
MARY is signed at the same locus as JOHN, the inconsistency in the context, namely
that John is not Mary, leads me to conclude that the index is not in the domain of
the file. There are scenarios that can push this claim further. For instance, if an
individual is both a linguist and a student, the interpretation of signing the two
at different loci or at the same locus can be informative. This point will not be
addressed in more detail here, but I note that this rule does not allow us to
distinguish between the two alternatives for dealing with locus reuse and sharing
cases proposed by Schlenker. This formulation is compatible with either a feature
deletion account or a part-whole account of the phenomenon. Below, I dwell on these
possibilities a little longer.
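
The effect of the consistency requirement in (79) can be illustrated with a small sketch in Python. It rests on several assumptions that are not in the rule itself: a file is reduced to a domain of indices plus a list of (index, noun) conditions, the inconsistent case is resolved by introducing a fresh index (following the prose above), the novel case also records the description, and the consistency check `same_noun_only` is a deliberately crude, hypothetical stand-in for real contextual reasoning.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


@dataclass
class File:
    """A toy file: a domain of indices and descriptive conditions on them."""
    domain: Set[str] = field(default_factory=set)
    conditions: List[Tuple[str, str]] = field(default_factory=list)


def same_noun_only(file: File, index: str, noun: str) -> bool:
    """Hypothetical consistency check: a noun is consistent with an index
    only if every condition already on that index uses the same noun."""
    return all(n == noun for i, n in file.conditions if i == index)


def update(
    file: File,
    index: str,
    noun: str,
    consistent: Callable[[File, str, str], bool],
) -> File:
    """Sketch of (79): a familiar use adds a condition to an existing index;
    otherwise a new index is added to the domain."""
    if index in file.domain and consistent(file, index, noun):
        # Familiar use: Sat(F') = Sat(F + b_i in Ext("NP")).
        return File(set(file.domain), file.conditions + [(index, noun)])
    # Novel use: Dom(F') = Dom(F) u {i}. If the locus-based name is taken,
    # a fresh name is generated (assumed naming convention, e.g. "a2").
    fresh, n = index, 1
    while fresh in file.domain:
        n += 1
        fresh = f"{index}{n}"
    return File(file.domain | {fresh}, file.conditions + [(fresh, noun)])
```

Signing JOHN and then MARY at locus a yields two indices in this sketch, while signing JOHN at a again reuses the first one. The crude check would also treat LINGUIST and STUDENT as distinct referents, so the linguist and student scenario mentioned above would need a richer notion of consistency.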

For the purposes of my analysis of IX, I need to say nothing further. The
examples noted by Kuhn suggesting that IX is composed of features are
successfully integrated into my approach by adopting the rules proposed by
Schlenker that are described in this section. We now have a more complete
picture of the nature of ASL IX. Even so, one can attempt to disambiguate
between the two options of feature deletion and part-denotation by using
product-producer bridging examples. Schwarz (2009) proposes that these cases
require the representation of a null pronoun in the structure; thus, they behave
like regular anaphoric strong definites (Schwarz 2009: 268). Therefore, the
sentences in (80a) are structurally understood as (80b).


(80) a. I bought a book the other day. The author is French.
b. I bought a book the other day. The author (of it) is French.


Such a proposal leads us to consider that the author in such cases was never
introduced as a referent by itself, and that it only exists in relation to the
pronoun. One can employ a similar example in ASL: by attempting to refer back to
the locus associated with BOOK and AUTHOR with IX (without an NP), one can
determine whether AUTHOR was introduced in the discourse, namely if IX can
refer to it. Consider (81):


(81) IX_a JOHN_a BUY IX_b BOOK_b. IX_b AUTHOR_b SELF FRENCH. IX_a JOHN_a TIRED
TODAY. SLEEP. TWO HOURS LATER, WOKE-UP. THEN, REMEMBERED IX_b.
‘John bought a book. The author was French. John's tired today. He fell
asleep. Two hours later, he woke up and recalled it.’


My consultants maintain that the final pronoun IX in the example above can
refer to either BOOK or AUTHOR. This example indicates that an index for each of
these entities was introduced in the utterance. It seems that even though
AUTHOR in (81) was mentioned in relation to BOOK, ASL introduces a new index for
it. These data point me in the direction of the denotation-of-parts analysis
of locus sharing and reuse cases, since AUTHOR was separately introduced in the
discourse at the same locus. It appears that BOOK and AUTHOR form a plurality of
individuals associated with the same locus, and one can refer back to either part
of the plurality using IX and the rule in (77). Under a deletion analysis, capturing
these facts is not straightforward.

The example presented in (81) does not allow us to fully differentiate between
the two alternatives. However, we do learn something about these
product-producer bridging cases. Even in such examples, IX makes it possible to
set up a new referent for both the product and the producer, and one can return
to the locus associated with them later on in the discourse. For present
purposes, I do not expand on these data further, but leave them open for future
work.

Throughout this section, I have provided evidence for loci being composed of
features, and I have adopted a system of featural variables that makes it
possible to capture the full range of locus properties. These aspects are
important for the analysis at hand, as I crucially assume that bare NPs, unlike
IX+NPs, are underspecified for a locus feature. The difference between the two
nominal types is not that one introduces an index and the other does not, but
that the types of indices introduced by bare NPs and IX+NPs differ precisely in
their specification of these features.


4.4 Final points 


The analysis discussed here accounts for the distribution of IX in definite and
indefinite environments. Although I have discussed the proposal in detail, some 
judgments presented in the literature are not in line with those of my consultants 
and may need further investigation. I describe those examples in this section. 

Bahan et al. (1995) argue that IX before NPs is a definite marker, but they do so
on the basis of data that are incompatible with mine, at least as they stand. They
claim that IX+NP must necessarily be definite, which is at odds with the IX+NPs
in donkey sentences seen earlier. They provide the example below:


(82) # JOHN LOOK-FOR IX_a MAN_a FIX GARAGE.
# ‘John is looking for a man to fix the garage’ (Bahan et al. 1995: 4)


Example (82) is taken to show that the indefinite reading is unavailable with
the use of IX, as John is only looking for a particular man to fix the garage, not
any man. I do not agree with their argumentation here for two reasons: first, I
have shown that IX+NPs have an indefinite reading, and second, it is unclear what
effects are expected when a locus is set up for an entity that is not used further in
the discourse. In other words, it is unclear whether the IX+NP MAN in this
case truly lacks an indefinite reading, or whether the infelicity is simply a
result of introducing an entity that is set up to be continually referred to
throughout the discourse. Moreover, my consultants do not agree with this
judgement. Hence, I leave this example open for further investigation.²¹

Returning to the view arguing for IX as a demonstrative, Koulidobrova & Lillo- 
Martin (2016) also present a pair of examples that my consultants do not agree 
with. Therefore, I describe them here in order to address them in more detail. 
Taking into consideration that definite articles are known to carry covarying 
readings while demonstratives do not, Koulidobrova & Lillo-Martin argue that 
covarying readings are unavailable with IX. Consider the English examples first:


(83) That guy in the red shirt always wins. = referential / *covarying 
(Nowak 2013, as cited by Koulidobrova & Lillo-Martin 2016: 229) 


(84) The guy in the red shirt always wins. = referential / covarying 
(Nowak 2013, as cited by Koulidobrova & Lillo-Martin 2016: 229) 


The above examples describe two situations, one in which any unspecified 
individual wins, i.e. the covarying reading, and another in which one specified 
person wins, which is the referential reading. Both of the above examples allow 
for referential readings; however, only (84) allows for the covarying interpreta- 
tion. When the demonstrative that is used in (83), we do not get the reading for 
the rigged race where any person wearing red is the winner. This diagnostic is 
now applied to ASL to indicate that IX behaves more like a demonstrative than
a definite article. 


(85) IX_a PERSON_a / IX_a RED SHIRT SELF TEND WIN. = referential / *covarying
‘IX person / IX in the red shirt tends to win’ (Koulidobrova & Lillo-Martin
2016: 237).


(86) PERSON HAVE RED SHIRT TEND WIN. = referential / covarying
‘The person in the red shirt tends to win’ (Koulidobrova & Lillo-Martin
2016: 237).


²¹One way of resolving this example would be to continue the discourse on the man and check
whether the non-specific interpretation is available, but I do not have the relevant
example at hand.

It appears at first glance that these examples are problematic for the proposal.
However, I have already noted that IX+NPs are perfectly compatible with donkey
readings. Moreover, my consultants find a covarying reading acceptable in (85).
Since there is a discrepancy in the judgments between consultants, it would be
useful to retest these sentences with different contexts in order to clarify whether
a covarying reading is truly unavailable in these cases. In retesting these cases,
one should also be careful to test sentences that are only minimally different:
(85) and (86) are not a minimal pair.

The above examples, at least on the surface, are points of contention between 
the different analyses. Possibly, there is true inter-speaker variation in the lan- 
guage as the ASL signing community is extremely spread out. Nevertheless, as I 
have discussed, these matters are not immediately problematic for the analysis 
at hand without further investigation. 


4.5 Summary 


Before moving on to the implications of my analysis, let me summarize my find- 
ings thus far. After I present an overview of the various discussions in this paper, 
I contemplate the theoretical implications of this proposal in the following sec- 
tion. 

Previous work on ASL assumed that loci were the overt realization of an index 
introduced by discourse referents, and that IX+NPs were demonstratives. In this
paper, I showed that both bare NPs and IX+NPs introduce an index, but these
indices are of different types based on their specification or underspecification 
of a locus feature. In doing so, I also showed that both nominal types double as 
definite and indefinite expressions. This fact results in the nominals having the 
ability to either set up a new referent, or refer back to a familiar one if they have 
the same index. The ability to set up a new referent when the index is not in the 
domain of the file signifies that ASL definite expressions do not have a familiarity 
restriction. 

In spite of the lack of a familiarity restriction, I also showed that the two kinds
of definite articles observed by Schwarz (2009; 2013) correspond to bare NP and
IX+NP in ASL when they are not indefinite. This suggests that perhaps
definiteness is not completely semantically void, and that it does hold in ASL,
albeit only to an extent. The next section discusses the implications of the
analysis provided in this paper.


5 Discussion


Throughout this paper I have shown that the choice between bare NPs and IX+NPs
appears to be more or less unrestricted, barring unique definite environments,
which are the only instance where IX is not permitted. The examples
seen in §3 indicate that there is some restriction on locus association with unique
referents. However, one can imagine a scenario in which there are two unique
referents under discussion. It appears that in these cases, the locus association is
not completely ruled out. Consider the following example of a unique priest and
a unique principal at a school.


(87) ?1 VISIT SCHOOL. MET IX_a PRINCIPAL_a, IX_b PRIEST_b. IX_a PRINCIPAL_a NICE
LADY.
‘I visited the school and met the priest and the principal. The principal is
a nice lady.’


This example suggests that context can at least sometimes play a role in mak- 
ing IX felicitous with unique referents. Without delving into further detail, I 
leave open the possibility that uniqueness restrictions on IX may or may not 
consistently hold, although future work on such cases is necessary to determine 
whether definiteness in the language is semantically encoded. 


6 Conclusion 


The pattern of definite expressions in ASL, and the proposal that resulted from
it, can potentially pave the way to a new perspective on definiteness in this
language. I have already shown that there is no familiarity restriction on definite
expressions, as a new referent can be set up if its index has not already been
introduced. This tells us that definiteness might not be lexically encoded in ASL.
IX was previously assumed to be an overt index, which might have taken up a
special status. Given that both bare NPs and IX+NPs introduce indices and can be
either definite or indefinite, one may be led to rethink the nature of definiteness
in ASL, and perhaps in sign languages overall.

Examining ASL indices and bare NPs has unveiled many aspects of the language
in particular, and languages in general. It was first shown that the index
IX when referring to a locus is a strong definite article, and bare NPs are weak
definite articles that do not permit IX. This pattern indicates that the language
distinguishes between anaphoricity and familiarity on the one hand, and
uniqueness on the other. On the flip side, it was shown that the language does
not have a restriction on familiarity; a new referent can be introduced if it is
not already present in the discourse.

In the literature, only ASL loci were typically viewed as indices. Here,
reanalyzing definite and indefinite expressions allows us to view things a bit
differently, as I proposed that bare NPs introduce indices as well. The double
life of IX+NPs and bare NPs as definite and indefinite expressions, which do not
have a familiarity restriction imposed on them, suggests that we are not dealing
with a system that lexically encodes definiteness. Instead, I find that pragmatics
might play a huge role in facilitating conversation, and in a language that has
the option of using loci, the specification of a locus feature can play a role in
determining whether or not an expression has been introduced.

Finally, the data reported in this paper are the judgments of three ASL signers. 
Future work on the topic would greatly benefit from experimental work investi- 
gating native speaker intuitions on a greater scale. There is known to be signif- 
icant interspeaker variation in the community, and any such variation could be 
captured by surveying a larger group of ASL signers.


A nascent definiteness marker in
Yokot'an Maya 


Maurice Pico 


Leiden University 


This paper examines the characteristics of a nascent definiteness marker in the 
Yokot'an language from the Mayan family from both a synchronic and diachronic 
perspective. The paper examines the contemporary distribution of the determiner 
ni, comparing it to that of the enclitic ba, which roughly corresponds to a topic 
marker. It employs Centering Theory to analyze oral materials, concluding that 
the use of the two particles is partially motivated by the processing cost of atten- 
tional shifts. Given that the determiner ni has been argued to develop from the 
distal demonstrative jini through grammaticalization, a diachronic perspective is 
also considered. The different synchronic uses of the determiner illustrated in this 
paper are then compared to the grammaticalization stages proposed for the devel- 
opment of definite articles. Both approaches ultimately suggest that ni conveys 
definiteness based on discourse-salience, not identifiability. The diachronic analy- 
sis further suggests that ni has started to bear some contrastive meaning related 
to reference in restricted contexts (reference to kinds in generic statements and 
specific reference in negative existential statements), indicating that the use of ni 
has spread beyond a pure topicality marker. Furthermore, the synchronic textual 
analysis in terms of Centering Theory clarifies some of the claims in
Grammaticalization Theory regarding the early stages of definite articles by
linking their emergence to the need to flag attentional shifts in
utterance-by-utterance processing of discourse.


1 Introduction 


Yokot'an, a Mayan language from the Ch'olan branch spoken in the state of 
Tabasco, Mexico, makes use of demonstratives and deictic enclitics as NP
modifiers but has also developed a reduced form ni which no longer seems to have
deictic value (1).¹


(1) A-x-e tä num-e t-u-pat ni bojte’.
ERG2-go-IPFV PREP pass-INF PREP-ERG3-back DET fence


(Car-f): ‘You are going to pass behind the fence.’
[chf MG CAR 28-30 (1:32-1:35), Delgado-Galván 2018]


In her 1984 dissertation on Yokot'an morphosyntax, Knowles-Berry (1984: 209)
proposes ni as a “definite determiner”, but does not illustrate or further
justify this characterization of its behavior. The goal of this paper is precisely to
explore the main functionality of the determiner ni of Yokot'an. I will show that
this determiner does not easily fit the usual characterization of definite articles 
as items with high textual frequency conveying familiarity, uniqueness of refer- 
ence or identifiability via general knowledge (Himmelmann 2001: 832). Instead, 
I will show through textual analysis of oral materials that the distribution of ni 
exhibits a discourse-salience related role. I will unravel the main function of ni 
on the basis of two axes. The first axis is the synchronic perspective whereby 
ni overlaps in function with a topic marker, the enclitic ba. This overlapping 
relation will emerge through textual analysis performed with the help of a the- 
ory developed within computational linguistics: Centering Theory (Grosz et al. 
1995). The second axis is the diachronic perspective whereby the form ni is a
reduction of the distal demonstrative jini, a form that has been reconstructed all
the way back to Proto-Mayan *ha’ in, through intermediary reconstructions *hin+i
for Western Ch'olan and *ha’in+i for Proto-Ch'olan.² While the diachronic
relation between ni and jini has been proposed and argued for elsewhere (Mora-Marín 2009),
it will be my contribution to try and relate synchronic uses of ni with different 
stages attested in the grammaticalization theory of articles. Furthermore, I sug- 
gest that the textual analysis in terms of Centering Theory links together two 


¹The abbreviations used in the examples can be found at the end of this paper. I have replaced
the labels A and B used for pronominal indexes in traditional Mayan linguistics by the more
standard ERG and ABS, respectively. A disadvantage, however, is that such glosses misleadingly
suggest that the corresponding forms always convey ergative or absolutive grammatical rela- 
tions, which is not accurate. Firstly, the same set of pronouns are also used in the nominal 
domain for possession and predication (respectively), and secondly, if seen as an “ergative” 
language, one must concede that Yokot'an presents a split in imperfective clauses.

²Throughout this paper, I will make use of a practical alphabet for the transcription of examples
from Yokot'an, which conforms to the extent possible to current practice in Mayan languages
with a standardized alphabet. The values of the orthographic symbols are as expected except
for ä=[ə], ch=[tʃ], x=[ʃ], j=[h], and ’=[ʔ]. The only exception will be in the context of Mayan
historical linguistics where, following its tradition, I will write h=[h].

independent observations on the grammaticalization of definite articles and
illustrates how they fit together, thereby providing a better understanding of the
early stages of grammaticalization of articles and their initial parallelism with 
the development of topic markers from demonstratives. No attempt whatsoever 
is made to put forward a semantic characterization of the meaning of ni, but I
hope that this first text-oriented and functional analysis will lay the
groundwork that will make such an undertaking possible.

This paper is organized as follows. In §2, I review the standard conception
of definiteness as rooted in uniqueness or familiarity of reference. I then
show that neither seems a natural choice to represent the main motivation
behind the use of ni. Moreover, I discuss the relative optionality of ni to argue that
its function is likely sensitive to discourse-management motivations. In §3, I turn
to an utterance-by-utterance discourse analysis of the texts to justify a
discourse-salience definiteness for ni, or, as Walker & Prince (1996) would put it, a
view of ni as a marker of the "Discourse-status" of the entity evoked by an NP, as
opposed to its "Hearer-status" (its availability in the background knowledge of
speaker and/or hearer). To this end, in §3.3, I illustrate attentional transition
types in Yokot'an discourse within the Centering Theory framework, with preliminary
concepts given in §3.1-§3.2. In §3.4, I show the association of ni occurrences
with attentional transitions of some type, where its functional overlap with the
topic marker ba will be apparent. In §4, I incorporate a diachronic perspective
by looking at the current distributional properties of ni through the lens of the
well-attested path of grammaticalization from demonstratives to articles. In §4.1,
I assess whether the determiner ni has departed from being a demonstrative, and
I do so by following two criteria: a quantificational one (§4.1.1) and a
qualitative one (§4.1.2). Once we have seen that ni has progressed along the
grammaticalization path towards a definite article, away from its demonstrative
source, the stages proposed in the grammaticalization literature become
relevant and I proceed, in §4.2, to pinpoint the stages at which ni currently stands
with its several uses. The textual distribution that the Centering Theory analysis
revealed in §3.4 now comes to clarify how two independent observations on the
early stages of definite article grammaticalization fit together. Finally, given that
topic markers can also develop from demonstratives, I point out in §4.3 a
specialized use of ni as a marker of specific reference, which occurs in the
restricted context of negative existential constructions. In this way, I argue that,
notwithstanding its main function as a discourse marker of topicality shifts, ni is
better seen globally as a nascent definite article based upon salience management,
rather than as a pure topic marker. In §5, I summarize the conclusions of this
first study on the determiner ni.

We are now ready to turn to §2, where I will show that the standard cognitive
correlates associated with definiteness are not sufficient to explain the
distribution of ni. Furthermore, the reader is faced with the scarcity and seeming
optionality of the form ni. This will motivate the view of ni as a
discourse-oriented particle.


2 Which sort of definiteness for ni? 


In this section I illustrate some of the difficulties that can be encountered when 
trying to understand the contribution of a previously undescribed determiner. 
I briefly compare the distribution of ni with what would be expected from the 
standard treatment of definiteness which is informed by the historical debates 
on definite descriptions in more familiar languages. This will make apparent the 
need to move on to discourse motivations behind the use of ni, which then will 
be seen as a marker of NP discourse status (or of transitions between them) at 
the end of §3. Reasons to maintain ni as a nascent definite determiner rather
than as a purely pragmatic particle will be apparent in §4 with the insights from
Grammaticalization Theory. 

The treatment of definiteness in linguistics emerged from an originally philo- 
sophical debate around the contribution of the so-called "definite descriptions" 
to the meaning of the utterances in which they appear. Most accounts of defi- 
niteness take definite descriptions to denote identifiable referents and are built 
around three main ideas: 


• The definite article indicates the identifiability of the NP's referent.

• Identifiability stems from the uniqueness of the referent that satisfies the
descriptive content of the NP (within a given situation), or

• Identifiability stems from such a referent being already familiar to both the
speaker and the addressee - in particular through previous discourse
mention - regardless of its descriptive uniqueness.


These ideas have been exploited independently or in a combined fashion.³ The
intuition that definiteness involves the uniqueness of the referent is motivated by


³ The initial philosophical discussion can be found in Frege (1892); Russell (1905) and Strawson
(1950). For modern accounts of definiteness as uniqueness I refer the reader to Hawkins (1978;
1991) and Abbott (1999). The familiarity perspective is embodied by a dynamic semantic analysis
of anaphora resolution. This kind of analysis embeds utterance interpretations into their
discourse context to allow for inter-sentential anaphora resolution, including anaphoric def-

cases where the referent is picked out through an immediate and unambiguous
situational availability without the need for any previous linguistic co-text (see
Hawkins 1978: 103, 110). Example (2) illustrates such cases:


(2) Context: In a carpentry workshop after some time silently working
together.
Could you please hand me the smoothing plane on the workbench?


Familiarity, on the other hand, aims to reflect cases like (3), where no visual/si- 
tuational input is needed for the hearer to properly interpret the utterance, rather 
relying on a previous mention: 


(3) While I was fixing my bike yesterday, a man and a woman approached me 
and asked for directions. The man had a strange accent. I couldn't guess 
where he was from. 


When a definite article is known to have developed diachronically from a 
demonstrative, uniqueness and familiarity can both be seen as an outcome of a 
specialized use of deixis. Uniqueness within a situation would then develop from 
spatial exophoric uses of a demonstrative while familiarity would develop from 
anaphoric uses (there is some discussion about whether one use is more funda- 
mental, see Lyons 1999: 160). Some languages even develop two different articles, 
each specialized in one of the uses, an article for expressing uniqueness-based 
definiteness (which would correspond to a weak article in Schwarz 2013) and 
another for expressing familiarity-based definiteness (corresponding to a strong 
article in Schwarz 2013). Given that ni likely originates from the distal demonstrative
jini, one may be led to expect it to fit the previous picture. However, a first
difficulty already arises with its rather scarce presence in texts, as compared to
the rather common situation in which an entity has already been mentioned or
in which it is ostensibly unique or perceptually salient in the context.

Interestingly, the oldest texts that I could consult of modern Yokot'an (two 
texts collected by Keller & Harris in 1946 or earlier) do not contain a single occur- 
rence of the determiner ni. Its absence in a given corpus, especially a corpus not 
exceeding two pages, cannot be taken as evidence of non-existence, however. If 
we assume that the determiner ni was already in the language, I find it significant 


inite NPs. Discourse Representation Theory (Kamp 1981) and File Change Semantics (Heim
1982) are the main starting points in the formalization of this idea. Both characterizations of
definiteness have also been combined in other accounts either to jointly provide a treatment
of a given definite article (Farkas 2002; Roberts 2003) or to account for different articles with
their own specialized meaning contribution (Schwarz 2009; 2013).

that "indefinite" NPs are always taken up again as bare nouns without any determiner,
as I show in the textual sequence (4a-c). Both ajyäx ‘crab’ and ixmuch ‘frog’ are
introduced with the numeral ‘one’ (+ classifier), but their respective references
are resumed later with bare nouns, rather than with a sequence ni+N.⁴


(4) Yokot'an (Tapotzingo, Nacajuca; Keller & Harris 1946: 138)

a. Ajn-i um-p'e aj-yäx. Näts’äti’ pa
be.located-PFV[ABS3] one-NUM.CLF CLF.M-crab close mouth lake
y-otot. [...]
ERG3-house
‘There was a crab. Near the bank of the river was his house. [...]’

b. Bix-i tä wa'wa'n-e pan ji. I
go-PFV[ABS3] PREP drift.around-INF PREP sand and
u-nuk’t-an un-tu ix-much.
ERG3-find-IPFV[ABS3] one-NUM.CLF CLF.F-frog
‘He went for a walk on the sand. And met a frog.’

c. U-pek-än tä ts'aji. Ix-much uy-äl-e’ tan
ERG3-call-IPFV[ABS3] PREP chat CLF.F-frog ERG3-say-IPFV[ABS3] PREP
u-k'ajalin: ya’ kä-x-e ka-xik’-e’ aj-yäx.
ERG3-mind SD ERG1-go-IPFV ERG1-fool-IPFV[ABS3] CLF.M-crab
‘He [the crab] spoke to it [the frog]. The frog talked in his mind: “I'm
going to make a fool of the crab.”’

This could suggest that the determiner ni is not associated with anaphorically
based familiarity, i.e. it is not an article of the strong type in the terms of Schwarz
(2013). Thus, ni would not be required for an NP to be interpreted as referring
to the same entity as one previously introduced in the discourse. Example (5)
shows a mid-text mention of one of the main characters of a story. Both
speaker and hearer thus know, and are assumed to know, the referent. Clearly, a bare
noun is enough.


(5) Y-äl-i balum: “kä-x-e tä och-e tan noj
ERG3-say-PFV[ABS3] jaguar ERG1-go-IPFV PREP enter-INF PREP big
bujchach”.
basket
(Alb-m): ‘The jaguar said: “I am going to enter into the big basket”.’
[chf_HT_ALB_624_(24:10-24:12), Delgado-Galván 2018]


⁴ The reader may notice that the nouns yäx and much are preceded by gender classifiers aj- and
ix-. These are not crucial for the current discussion, as we will observe later in example (5) that
their absence does not hinder the capability of a noun to be interpreted definitely.

As an anonymous reviewer kindly noted, one may wonder whether the NP
balum has received special treatment or has turned into a proper name in view of
the mythological character of its referent and its cultural prominence. However,
we can see in example (6) that this is rather the standard treatment of NPs. The
monkey, ajpum, gets introduced with the numeral ‘one’ (+ classifier), un-tu, and
is then mentioned again later. Once more, a bare noun is enough.


(6) I ya-i u-nukt-i un-tu aj-pum. [...]
and SD-DIST ERG3-find-PFV[ABS3] one-NUM.CLF CLF.M-monkey [...]
u-k’ech-i aj-pum. [...] i u-bis-an
ERG3-grab-PFV[ABS3] CLF.M-monkey [...] and ERG3-bring-IPFV[ABS3]
aj-pum t-u-chejpa.
CLF.M-monkey PREP-ERG3-rib
(Bla-m): ‘He found a monkey. [...] he took the monkey. [...] and brings
the monkey with him on his side.’ [chf_HS_BLA_27-30_(03:41-04:03),
Delgado-Galván 2018]


In this case, the extracted example comes from an elicited picture-story and 
thus none of its characters can be assumed to be culturally prominent. Given that 
anaphoric familiarity doesn't seem to trigger the use of ni, one may try to verify 
whether it behaves akin to a weak-type article (Schwarz 2013), with definiteness 
based upon uniqueness. Starting with example (7), we see that the determiner ni
is not used - and in fact is unnatural to use - in cases of global uniqueness like
the sun, the moon, etc.⁵


⁵ I say “unnatural” rather than “ungrammatical” since Knowles-Berry (1984) provides a
counterexample, with ni introducing global entities like the sun (i).

(i) A tik-i ni nok ka ni kin.
AUX.PFV dry-PFV[ABS3] DET clothes PREP DET sun
‘The clothes dried because of the sun.’ (Knowles-Berry 1984: 309)

Still, the more acceptable strategy is to avoid using the determiner ni in these cases, as can be
seen in another example from the literature below (and is also confirmed by my collaborators
in the field):

(ii) Jik’in a tuts’-i k'in a bix-on tä patan.
when AUX.PFV appear-PFV[ABS3] sun AUX.PFV go-ABS1 PREP work
‘When the sun appeared, I went to work.’ (Schumann Gálvez 2012: 113)

(7) ta ke t'üb-o (??ni) k'in
PREP COMP ascend-INF DET sun
(Luc-m): ‘until the sun rises’
[chf_TwoFishingmen_178_(10:10-10:11), Delgado-Galván 2018]


Example (8) illustrates the case of uniqueness within a restricted situation, in 
which the determiner ni is not used either. The context of the utterance is one in 
which only one dog (the family dog) is known to be behind the house and it is 
recognized by its barking. 


(8) Ya 'an tä woj wichu’ nanti.
SD EXIST[ABS3] PREP bark dog over.there

(Mar-m): ‘The (family) dog is barking over there (behind the house).’
[My elicitation, elic_deif_marc_08]


However, in contrast to (7), the determiner ni is perfectly acceptable in this context
and can be used in such examples, as can be seen in (9). The same translation
is kept to signal the lack of any difference in meaning.


(9) Ya ’an tä woj ni wichu’ nanti.
SD EXIST[ABS3] PREP bark DET dog over.there

(Mar-m): ‘The (family) dog is barking over there (behind the house).’
[My elicitation, elic_deif_marc_08b]


Thus, at least in some contexts of use, there is some freedom as to marking the
NP with ni. As a matter of fact, a narrative sequence similar to the sequence in
(4) above would nowadays still allow a nearly complete absence of ni.⁶ Although
Yokot'an has been considered to be a language with a "definite word distinct from 
demonstrative" by Dryer (2005) - probably based upon examination of Knowles- 
Berry's (1984: 209) proposal of ni as a "definite determiner" - a large portion of 
NP instances in Yokot'an which would be translated by a definite noun phrase 
in English fail to have any determiner at all, i.e. they are bare nouns. This points 
to an aspect that complicates the cross-linguistic picture of definiteness. It is the 
non-negligible number of "languages where there is an article that is restricted to 
but not obligatory in definite contexts" (Dryer 2014: e234), i.e. languages which 


⁶ As an example, not a single instance of ni appears in the sample text provided in the appendix by
Knowles-Berry (1984: 371-382). An exception to this scarcity of ni is written Yokot'an, where
Spanish as a model of literacy exerts an enormous influence and tends to impose the art+N
nominal template.

do have definiteness markers of some kind but whose definite NPs, somewhat
paradoxically, do not seem to require them in the first place.

The need for motivation is twofold. Diachronically, the optionality - to varying
degrees - raises the question of why a language would develop a seemingly
dispensable marker. Synchronically, the optionality of definite articles raises the
question of why a speaker might use them. Both aspects are linked.
According to Hawkins (2004: 84), the compelling motivation for the diachronic 
emergence of a definite article from a demonstrative "to express meanings that 
are perfectly expressible in languages without definite articles", originates from 
synchronic processing needs of grammar rather than from semantics or pragmat- 
ics. Interestingly, Givón (2001: 474) points out that “Grammaticalized definite 
markers [...] arise first to mark topical definites”, which implies that nascent 
definite markers do not systematically accompany every NP interpreted as iden- 
tifiable, but rather seem to come associated with a change of discourse-status 
regarding the NP concerned. 

In the next section, I will introduce two notions to capture these two aspects of 
an NP: the Hearer-status (related to identifiability and to the common-ground) 
and the Discourse-status (related to processing and to the referent's status in 
the short-term memory). Under this view, nascent definite markers are better 
seen as some sort of Discourse-status markers which are concerned with the 
optimization of both discourse and utterance processing. It is precisely in this 
way that topicality gets modeled by Centering Theory. In §3, I will introduce this
theory and use it as a heuristic device to guide our quest for the functionality of
ni in oral texts. To this end, I will apply the theory to a selection of samples from
oral materials to better understand how attentional shifts in utterance sequences
affect the likelihood that an NP is introduced by ni.


3 Centering Theory and the discourse-management use of ni


3.1 Framework 


Centering Theory, which is a component of a less well-known discourse theory
from computational linguistics, could be perceived as one more approach to address
pronominalization/anaphoric resolution and, in that way, as a competitor
to other theories addressing the anaphoric properties of NPs. More established
theories of discourse-oriented analysis of sentences exist, like DRT, but these
were not originally proposed in order to model attention management (or information
structure) and its interaction with the shape of NPs and their structural
position in sentences.⁷ This difference stems from a different approach to the
dual nature of referring expressions, which can be seen from a semantic or from
a syntactic viewpoint. From the syntactic viewpoint, referring expressions have
an impact on sentence linking and processing. From the semantic perspective
they have an impact, via evoked entities, on the common ground of speaker and
hearer.⁸

Let me explain. NPs can uncontroversially be taken to evoke discourse entities.
These entities may bear information statuses of different nature. This has been
noticed - among others - by Walker & Prince (1996: 291-294), who propose to
distinguish the Hearer-status of a discourse entity from the Discourse-status triggered
by its evoking formal device. I summarize my interpretation of their views
in Figure 1, below. The Hearer-status is the belief, by the speaker, as to whether
a discourse entity is known or inferable for the intended audience and thus can
be assumed to be in the common ground (or not). If it is believed to be known
or inferable, the NP will tend to be marked as definite, otherwise, as indefinite.
Under this point of view, definiteness is nothing other than identifiability via general
knowledge. But the discourse entities are evoked through formal devices,
and these formal devices - which can range from full NPs to referential indexes
in the verb - have formal discourse properties of their own, regardless of the
identifiability of the evoked entity. A referring formal device has a potential for
salience which emerges from its overall structural role in the sentence. Moreover,
in a sequence of utterances, the same discourse entity might have been evoked
by devices with different salience. A given level of salience of an NP may affect
the activatedness of the evoked discourse entity in the next utterance.⁹


⁷ In particular, the concepts of topic and focus were not included in the standard format of
DRT (see Kamp & Reyle 1993: 360, 639).

⁸ The complementarity of Centering Theory, which emphasizes the first perspective, with other
approaches that emphasize the second perspective has been noticed by many, with suggestions 
towards integration in Walker & Prince (1996) and Gundel (1998) for the Givenness Hierarchy 
and in Roberts (1998; 2012) for DRT. 

⁹ The term activation is usually preferred within linguistics literature and it is often associated
with a single Familiarity/Givenness/Accessibility scale for NP classification (cf. Ariel 1990; Gun- 
del et al. 1993; Kibrik 2011), but Walker & Prince (1996: 294) use the term activatedness "or 
Discourse-status" to make clear that they consider givenness and activation as independent, 
orthogonal, scales to be treated separately. Thus, activation usually involves an amalgamated 
scale with givenness, while activatedness is roughly activation considered separately. I stick 
to the latter term since I have based my framework on Walker & Prince (1996). Kantor (1977) 
introduced the term activatedness within computational linguistics covering a loosely similar 
idea. The discussion of similarities and differences in the use of these terms from author to 
author should not concern us here. Since I use Centering Theory to model activatedness, just 
as Walker and Prince (1996) propose, there is no risk of vagueness or confusion in the use of
this term.

Two types of Information status for a discourse entity

1. HEARER-STATUS (related to givenness and inferability)
• entity known or inferable (NP coded as definite)
• entity not known and not inferable (NP coded as indefinite)

2. DISCOURSE-STATUS (related to activatedness and salience)¹⁰

a) SALIENCE (upcoming activatedness): The formal SALIENCE of the evoking
NP (or referential index) in the utterance U_i currently being processed.
It affects the ACTIVATEDNESS of the discourse entity for the next
utterance U_{i+1}.

b) ACTIVATEDNESS (former salience): The formal SALIENCE of the evoking
NP (or referential index) in the utterance U_{i-1} that has been processed
before the current one. This affects the ACTIVATEDNESS of the evoked
discourse entity in the utterance U_i currently being processed.

Figure 1: My interpretation of Walker & Prince (1996)


In other words, an entity evoked by the discourse has two orthogonal,
logically independent statuses: a Hearer-status (Is it familiar to the hearer or
inferable?) and a Discourse-status (Is the evoking device formally salient in the
utterance currently being processed? Was it formally salient in the preceding
utterance (thus promoting an activated referent in the current one)?).

In §2 I have shown that the Hearer-status cannot by itself account for insertions
of ni, which is why I now turn to Centering Theory to inspect how
the Discourse-status of NPs or, rather, changes in that status (attentional
transitions) relate to the presence of the determiner ni. Centering Theory is well
suited to this aim, since it is precisely an attempt to model the way in which 
the changing salience of referring expressions in an utterance helps to manage 
attention and attention shifts throughout a discourse progression. As such, it is 
also intended to be a component of a larger theory of discourse coherence. 

Discourse typically involves utterances organized in smaller discourse seg- 
ments. Thus, the coherence of a discourse emerges at two levels: between the 
utterances within a single discourse segment (local coherence), and between 
that segment and other discourse segments (global coherence) (Grosz et al. 1983: 
44). Each level of discourse structuring and coherence is associated with a corresponding
level of attention or focusing:¹¹ local attention (or centering) and global
attention. Centering Theory is devoted to the study of local coherence and the attentional
transitions from one utterance to the next, that is, it is a theory of local
discourse structure (Grosz et al. 1995; Grosz & Sidner 1998; Walker et al. 1998).


¹⁰ In Centering Theory, the notions (2a) and (2b) are locally modeled, respectively, by the concept
Cp(U_i) and by the preference for Cb(U_i) = Cp(U_{i-1}); these will be presented in §3.2, below.


3.2 Centers of an utterance 


Centering Theory models the contribution of NPs (or, more generally, referential 
indexes) to the coherence of a local discourse segment by recognizing two ways 
in which an utterance affects the structure of a coherent discourse. Both ways 
involve the fact that any utterance U evokes a set of discourse entities which can 
then be used as a cohesive link with adjacent utterances. The first way is by estab- 
lishing a link with the previous utterance through topic continuity. The second 
way is by establishing a discourse entity evoked in the current utterance as the 
default choice for being picked-up as topic by the next utterance. This prospec- 
tive suggestion regarding topicality crucially involves the structural salience of 
a referring device and exploits the relation between the salience and the acti- 
vatedness illustrated in Figure 1 above. When considered in this way, as links 
between adjacent utterances, the discourse entities evoked in U are named the 
centers of U (Grosz et al. 1995: 208). Since all entities in this set can potentially be 
talked about in the next utterance, its members are called forward-looking centers 
(Cf). Among these, an utterance often has a center of attention, a privileged cen- 
ter which constitutes the main link to the previous utterance, i.e. the backward- 
looking center. It roughly can be seen as a special kind of topic: a strictly local 
topic (as opposed to a global topic, which encompasses the entire discourse or 
discourse segment). 

As I anticipated above, one of the main claims of Centering Theory is that 
each utterance has not only a current center of attention (the Cb), but also a 
proposed anticipation of what the center might be in the next utterance (the de- 
fault choice for its Cb), which depends on a ranking of the Cfs according to their 
salience, mostly determined by grammatical structure. For the present discussion
I take grammatical relations as the main ranking factor, as follows: SUBJ >
OBJ > ADJUNCT.¹² That is, the entities evoked by arguments rank higher up than


¹¹ Grosz et al. (1983: 44) use the term focusing, but to avoid confusion with the more specialized
use of the term in information structure studies, I will rather speak of attention. Hence I will
speak of global attention and local attention for what is termed global focusing and local focusing
in the original paper. For a discussion of the relation between the concepts of focusing from
CT and focus and topic from Information Structure, I refer the reader to Gundel et al. (1993: 279,
footnote 10).

those evoked by non-arguments and, for transitive clauses, the ergative argument
ranks higher than the absolutive as well. The highest ranked Cf is singled
out as the preferred center (Cp) which is the default candidate to be the backward- 
looking center (Cb) of the next utterance. To summarize: 


Forward-looking centers (Cf): Cf(U_i) = the set of discourse entities evoked by an
utterance U_i

Preferred center (Cp): Cp(U_i) = the highest ranked element of Cf(U_i) in terms of
salience.

The Cp constitutes a prediction about the Cb of the following utterance.

Backward-looking center (Cb): Cb(U_i) = the highest ranked element of Cf(U_{i-1})
realized in U_i

Observe that the Cb(U_i) does not coincide with the preferred center Cp of
U_{i-1} when the latter is not evoked in U_i (in such a case, the next highest ranking
entity of Cf(U_{i-1}) will be taken as Cb, if evoked). Depending on the continuity or
disruption between the local topic Cb(U_i) and the anticipated topic Cp(U_i) of an
utterance U_i, or between the local topic Cb(U_{i-1}) of a previous utterance U_{i-1} and
the one from the current utterance, Cb(U_i), we can have several types of center
attention transitions, which are displayed in Table 1 of the next section.
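
The three definitions above can also be read procedurally. The following Python fragment is only an illustrative sketch and not part of the analysis itself: the function names (rank_cf, preferred_center, backward_center), the encoding of an utterance as a list of (entity, grammatical relation) pairs, and the toy entity labels are all assumptions introduced here. The only element taken from the text is the ranking SUBJ > OBJ > ADJUNCT.

```python
from typing import List, Optional, Tuple

# Ranking adopted in the text: SUBJ > OBJ > ADJUNCT (lower number = more salient).
RANK = {"SUBJ": 0, "OBJ": 1, "ADJUNCT": 2}

# Hypothetical encoding: an utterance is a list of (entity, relation) pairs.
Utterance = List[Tuple[str, str]]


def rank_cf(utterance: Utterance) -> List[str]:
    """Return Cf(U): the evoked entities ordered by salience, without repeats."""
    ordered = sorted(utterance, key=lambda pair: RANK[pair[1]])
    cf: List[str] = []
    for entity, _relation in ordered:
        if entity not in cf:
            cf.append(entity)
    return cf


def preferred_center(utterance: Utterance) -> Optional[str]:
    """Return Cp(U): the highest ranked member of Cf(U), if there is one."""
    cf = rank_cf(utterance)
    return cf[0] if cf else None


def backward_center(previous: Utterance, current: Utterance) -> Optional[str]:
    """Return Cb(U_i): the highest ranked member of Cf(U_{i-1}) realized in U_i."""
    realized = set(rank_cf(current))
    for entity in rank_cf(previous):
        if entity in realized:
            return entity
    return None  # no shared center, i.e. Cb = ?


# Toy illustration with assumed grammatical relations (not an analysis of the corpus).
u_prev = [("kid", "SUBJ"), ("night", "ADJUNCT")]
u_cur = [("kid", "OBJ"), ("frog", "SUBJ")]
print(preferred_center(u_cur), backward_center(u_prev, u_cur))
```

In this toy pair the current utterance keeps the kid as its Cb while proposing the frog as Cp, which is the configuration in which the local topic and the anticipated topic come apart.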


3.3 Transitions between utterances 


Since every utterance evokes entities (and therefore has centers), there can be 
continuity of centers from utterance to utterance or there can be shifts of centers. 
Two main parameters govern the quality of a transition from one utterance to 
the next. One parameter is whether both utterances maintain the same local topic 
(Cb) or not (first and second columns in Table 1 below). The second parameter is 
whether the local topic (Cb) of the second utterance corresponds to its anticipated
or suggested topic Cp (upper row in Table 1) or not (bottom row). 

A CONTINUE transition type is the least disruptive one, as the center of atten- 
tion (or roughly the “local topic") in the current utterance does not replace a pre- 
vious one and is additionally set up as the preferred center (Cp), the "anticipated 


¹² For this ranking, which has proven to be accurate enough for my textual analysis, I follow
Hedberg (2010: 1837-1838). The segmentation of utterances based upon the logic of clausal units
rather than pure intonation follows Prince (1999) and Kibrik (2011).

Table 1: Center Transitions (Walker et al. 1998)

                      Cb(U_{i-1}) = Cb(U_i)      Cb(U_{i-1}) ≠ Cb(U_i)
                      (or Cb(U_{i-1}) = ?)
Cb(U_i) = Cp(U_i)     CONTINUE                   SMOOTH-SHIFT
Cb(U_i) ≠ Cp(U_i)     RETAIN                     ROUGH-SHIFT


or suggested topic” for the next utterance.¹³ According to this model, a maximally
gradual change of attention would ideally involve a sequence of two transitions
(so: minimally two utterances), one RETAIN transition which anticipates
a shift in topic and one SMOOTH-SHIFT transition which executes it. However,
more abrupt shifts can involve both transitions compressed and collapsed into a
single transition executed within a single utterance: the ROUGH-SHIFT transition,
which would naturally be expected to invite the use of the most marked structures.
Centering Theory has the ordering rule in Figure 2 reflecting the intuition
that speakers try to maximize coherence and that these transitions are increasingly
less coherent (or, equivalently, coherent at a higher processing cost).


Transition states are ordered: 
CONTINUE > RETAIN > SMOOTH-SHIFT > ROUGH-SHIFT 


Figure 2: Ordering rule (Walker et al. 1998) 
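
The two parameters of Table 1 and the ordering rule of Figure 2 can be sketched as a small decision procedure. The Python fragment below is a hedged illustration only: the function names classify and more_coherent, and the use of None for an undefined previous Cb (Cb = ?), are assumptions made here for the sake of the example, and the current utterance is assumed to have a Cb.

```python
from typing import Optional

# Ordering rule of Figure 2: earlier in the list = more coherent (lower processing cost).
ORDER = ["CONTINUE", "RETAIN", "SMOOTH-SHIFT", "ROUGH-SHIFT"]


def classify(cb_prev: Optional[str], cb_cur: str, cp_cur: str) -> str:
    """Classify the transition into U_i following Table 1.

    cb_prev is Cb(U_{i-1}), or None when it is undefined (Cb = ?).
    cb_cur and cp_cur are Cb(U_i) and Cp(U_i); U_i is assumed to have a Cb.
    """
    # Column choice: same local topic as before (or no previous local topic).
    same_cb = cb_prev is None or cb_prev == cb_cur
    # Row choice: does the local topic coincide with the anticipated topic?
    if same_cb:
        return "CONTINUE" if cb_cur == cp_cur else "RETAIN"
    return "SMOOTH-SHIFT" if cb_cur == cp_cur else "ROUGH-SHIFT"


def more_coherent(a: str, b: str) -> bool:
    """True if transition type a is preferred over b by the ordering rule."""
    return ORDER.index(a) < ORDER.index(b)


# Toy entity labels, used only for illustration.
print(classify("kid", "kid", "kid"))    # same Cb, Cb = Cp
print(classify("kid", "kid", "frog"))   # same Cb, Cb differs from Cp
print(classify("kid", "frog", "frog"))  # new Cb, Cb = Cp
```

A gradual change of attention then shows up as a RETAIN step followed by a SMOOTH-SHIFT step, as in the second and third calls.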


One limitation of the basic format of Centering Theory presented above is that
it deals with transitions within topical chains (conceived as chains of utterances
where pairwise sharing of at least one center is maintained and thus Cb(U_i) is
always available). Not much is said about utterances lacking a Cb (Cb = none or,
equivalently, Cb = ?), which are the utterances that start a topical chain, either
because they are absolutely discourse-initial, or because they don't share any of
their centers with the previous utterance (Hedberg 2010: 1831).

For the present discussion, all I need is to complement Table 1 with Table 2, 
below. 


¹³ The informal expressions local topic and anticipated or suggested topic are mine. They are just
intended to guide the intuition of a reader who has no prior contact with Centering Theory.
¹⁴ Walker et al. (1998) label these transitions simply as NO CB transition. But then both kinds of
topical chain starts would be collapsed. Intuitively, it is a more drastic shift to ignore all the centers
introduced by a previous utterance than to start a discourse with no previously specified
information in the background. Some further refinements and a classification of the transitions
with the parameter (Cb = ?) have been proposed, see Poesio et al. (2004) and references therein.

Table 2: Center Transitions for chain-initial U_i (Poesio et al. 2004; Hedberg 2010)

                 Cb(U_{i-1}) = ?      Cb(U_{i-1}) = c
Cb(U_i) = ?      NULL                 ZERO (≈ ROUGH-SHIFT)


The row represents the chain-initial utterance U_i, where chain-initial is taken
to mean not having a backward-looking center Cb (Cb = ?). The first column
represents the situation in which the previous utterance is also “chain-initial”.
The special case where there is no previous utterance is not of importance here.
The second column represents the case where the previous utterance had a Cb
(Cb = c, for some entity c), and it was ignored by the current utterance. This case,
the ZERO transition, is really some kind of shift, so I will treat it as a special case
of ROUGH-SHIFT transition; see example (20) further down. Observe that when
Cb = ?, neither (Cp = Cb) nor (Cb = previous Cb) is true (which, for all practical
purposes, almost boils down to Cp ≠ Cb and Cb ≠ previous Cb). Furthermore,
in the case of the ZERO transition, it is known for sure that the Cb and the Cp
from the previous utterance exist and have been ignored. So I will consider this
as a degenerate case of ROUGH-SHIFT transition, which is why I added this
consideration in Table 2. It is more disruptive than ROUGH-SHIFT proper, since it
entirely dismisses the centers from the previous utterance. I expect it to invite
the use of non-neutral constructions even more.
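
The way Table 2 complements Table 1 can be made explicit with a short sketch. The Python fragment below is an illustration under stated assumptions, not an implementation taken from the cited literature: the names classify_extended and coarse_type are hypothetical, and None is used here to stand for an undefined backward-looking center (Cb = ?).

```python
from typing import Optional


def classify_extended(cb_prev: Optional[str],
                      cb_cur: Optional[str],
                      cp_cur: Optional[str]) -> str:
    """Classify a transition using Table 1 plus the chain-initial cases of Table 2.

    None stands for an undefined backward-looking center (Cb = ?).
    """
    if cb_cur is None:
        # Chain-initial utterance: no center is shared with the previous utterance.
        # NULL if the previous utterance had no Cb either, ZERO if its Cb was ignored.
        return "NULL" if cb_prev is None else "ZERO"
    same_cb = cb_prev is None or cb_prev == cb_cur
    if cb_cur == cp_cur:
        return "CONTINUE" if same_cb else "SMOOTH-SHIFT"
    return "RETAIN" if same_cb else "ROUGH-SHIFT"


def coarse_type(transition: str) -> str:
    """Collapse ZERO into ROUGH-SHIFT, treating it as a degenerate case of it."""
    return "ROUGH-SHIFT" if transition == "ZERO" else transition


# Toy entity labels, used only for illustration.
print(classify_extended("kid", None, "frog"))               # previous Cb ignored
print(coarse_type(classify_extended("kid", None, "frog")))  # counted with the rough shifts
print(classify_extended(None, None, "frog"))                # both utterances chain-initial
```

Keeping ZERO as a separate label and collapsing it only in coarse_type preserves the distinction between the two kinds of chain start while still allowing ZERO to be counted together with ROUGH-SHIFT.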

I will now illustrate centers and center transition types with a sequence of
contiguous utterances in Yokot'an from the Frog Story elicitation task (examples
(11-17) below). The utterance (11) has the kid (yokajlo’) as backward-looking
center (Cb), given the previous context - omitted - which offers yokajlo’ as
antecedent of the absolutive person marker of all verbs in (11).¹⁵ Moreover, the kid
(yokajlo’) is also the highest ranked forward-looking center, given its status as
subject argument (it is thus the preferred center Cp among all centers in the set
Cf). The centers of the utterance (11) can thus be represented as follows:


¹⁵ Centers are in boldface to remind the reader that these are not the linguistic expressions but
the entities realized by them.

¹⁶ I first display the backward-looking center (Cb). Then I display the set of Cfs by ranking order,
with its first member being the preferred center (Cp). Finally, I display the two parameters
that determine the transition type. For the sake of simplicity and to avoid overloading the
exposition, I will disregard many details which do not affect my analysis in crucial ways. For
example, I disregard the fact that some examples, like (11), in fact include two utterances of
which the second is a re-elaboration, and I will also skip the details about how backgrounded
clauses like e.g. the temporal clause k’echi’ ak’äb in (11) are treated.

(10) Utterance (11):
[Cb(yokajlo’), Cf(Cp(yokajlo’) > ak’äb)];
Cp = Cb;
Cb = previous Cb;
Ct: CONTINUE

(11) Ke ya' a bix-i tä wäy-e. K'ech-i
COMP SD AUX.PFV go-PFV[ABS3] PREP sleep-INF ERG3-grab-PFV[ABS3]
ak’äb, bix-i tä wäy-e.
night go-PFV[ABS3] PREP sleep-INF
(Esm-f): ‘Then he [the kid] went to sleep. When the night reached him,
he [the kid] went to sleep.’
[chf_FrogStory_ESM_006_(00:43-00:47), Delgado-Galván 2018]


Example (11) is, in fact, a CONTINUE transition with respect to the previous
(not presented) context. Consequently, the backward-looking center is evoked
through the most reduced referential form: a personal index in the verb, which
in this case is actually an implicit ABS3 index. This is a reminder that centers
are often realized by reduced referential devices. The following utterance in the
narrative, (13), has the following centers and center transition (Ct):


(12) Utterance (13):
[Cb(yokajlo’), Cf(Cp(yokajlo’) > wichu’ > ts'en)];
Cp = Cb;
Cb = previous Cb;
Ct: CONTINUE


(13) De y-o=ba wäy-e, de boo [u-]jin bix-i
PREP ERG3-want=TOP sleep-INF PREP tired ERG3-feeling go-PFV[ABS3]
tä wäy-e une=a pan u-ts’en, dok yok wichu’.
PREP sleep-INF PRO3=TOP PREP ERG3-bed COM little dog
(Esm-f): ‘He wanted to sleep, of tiredness he went to sleep on his bed,
with his dog.’
[chf_FrogStory_ESM_007_(00:48-00:54), Delgado-Galván 2018]


Since the backward-looking center (Cb) of (11) and (13) is the same, and the
preferred center is also shared, there is full continuity with respect to which center
gets most attention and will preferentially be attended to in the next utterance.
This illustrates the CONTINUE type of transition between utterances. However,
the next utterance (15) in the sequence starts to introduce a shift. While (15) and
(13) still share the same Cb, (15) introduces a new Cp, the frog (much), which
announces a future shift in the center of attention (a shift in "local topic"). (15) has
the following centers and center transition type:


(14) Utterance (15):
[Cb(yokajlo’), Cf(Cp(much) > yokajlo’)];
Cp ≠ Cb; (this announces a future shift of “local topic”)
Cb = previous Cb;
Ct: RETAIN


(15) Ix-much='a u-chün-i ke a wäy-i
CLF.F-frog=TOP ERG3-see-PFV[ABS3] COMP AUX.PFV sleep-PFV[ABS3]
yokajlo’=ba.
kid=TOP
(Esm-f): ‘The frog saw that the kid was asleep.’
[chf_FrogStory_ESM_009_(00:57-01:00), Delgado-Galván 2018]


The frog (much) is higher in the salience ranking than the kid (yokajlo’)
because it is evoked by an NP associated with the structural role of subject of
the transitive main clause, while yokajlo' is evoked as the intransitive subject of an
embedded clause. The fact that much is evoked by the ERG3 index of the main verb
makes it the Cp, but the fact that it is also mentioned with a full NP bearing the topic
marker ba can be attributed to the fact that Cp ≠ Cb. As we will see later (§3.4), at
this point we could have had the determiner ni introducing the NP ixmuch, either
redundantly with ba or without it.¹⁷ This illustrates the RETAIN type of transition
between utterances, which retains the local topic (yokajlo’) but announces its
demise. With the next utterance (17) in the sequence, I illustrate the SMOOTH
SHIFT type of transition, which executes the Cb-shift that was prepared in (15).
Utterance (17) has the following centers and center transition type:


(16) Utterance (17):
[Cb(much), Cf(Cp(much) > traste)];
Cb = Cp;
Cb ≠ previous Cb; (this executes the “local topic” shift)
Ct: SMOOTH SHIFT


17 This has also been confirmed by my collaborators in the field.

(17) U-ch-i aprobecha une, a pas-i tan
ERG3-make-PFV[ABS3] advantage PRO3, AUX.PFV exit-PFV[ABS3] PREP
traste bajka ya’ an='a.
jar where SD EXIST[ABS3]=TOP
(Esm-f): ‘She [the frog] took advantage, she [the frog] went out of the
bottle where she was.’
[chf_FrogStory_ESM_010_(01:00-01:05), Delgado-Galván 2018]


Since the backward-looking centers (Cb) of (15) and (17) are different, the shift
in center of attention that was anticipated with RETAIN in (15) is now completed
in (17). It is interesting to note that the frog (much) is evoked by a highly salient
formal device in both (15) and (17), namely the indexes for transitive and
intransitive subject, but only in (15) is a full NP used in preverbal position and with
a topicality marker. The first observation is linked to the fact that Cp(15) = much
and Cp(17) = much. The second fact (the use of a ba-marked NP) is linked to the
switch represented by much = Cp(15) ≠ Cb(15) = yokajlo’. This should draw our
attention to the fact that, under this model and analysis, ba-marked NPs such as the one
above do not flag the topicality of a discourse entity as such, but a switch of
topicality, i.e. the transitions characterized by a rupture Cp ≠ Cb (and, perhaps, the
fact that the Cp has just been introduced into the Cf set of the utterance without
having been present in the Cf of the previous utterance).

After this two-step shift of center, namely a RETAIN (15) plus a SMOOTH
SHIFT (17) transition, the flow of local attention proceeds with minimal
disturbance. Example (19) preserves the same centers as (17) and moreover does
not anticipate or announce any future shift. We again have a CONTINUE
transition, the least “marked” type of all.


(18) Utterance (19): 
[Cb(much), Cf(Cp(much))]; 
Cp = Cb; 
Cb = previous Cb; 
Ct: CONTINUE 


(19) A pas-i ya-i puts-i pues, a
AUX.PFV exit-PFV[ABS3] SD-DIST escape-PFV[ABS3] so AUX.PFV
bix-i.
go-PFV[ABS3]
(Esm-f): ‘She [the frog] escaped from there, she ran out, she left.’
[chf_FrogStory_ESM_011_(01:05-01:08), Delgado-Galván 2018]

Now I illustrate the most complex transition type, called ROUGH SHIFT, which,
as its name suggests, introduces an unannounced shift of center, in this case in
favor of the discourse entity yokajlo’, ‘kid’. Observe how the evoking device, the
NP yokajlo’, bears the topic marker ba.


(20) Ya-i a ch'oy-i isapan yokajlo’=ba.
SD-DIST AUX.PFV wake-PFV[ABS3] morning kid=TOP
(Esm-f): ‘Then the kid woke up in the morning.’
[chf_FrogStory_ESM_012_(01:09-01:11), Delgado-Galván 2018]


No center from the previous utterance is evoked, thus there is no backward- 
looking center in the current utterance and the new preferred center, the kid 
(yokajlo’), is introduced without any anticipation whatsoever in the previous 
utterance. 


(21) Utterance (20):
[Cb(?), Cf(Cp(yokajlo’) > isapan)]
Cb = ?;
previous Cb = much;
Ct: ZERO;
A special case of:
Cp ≠ Cb;
Cb ≠ previous Cb;
Ct: ROUGH SHIFT


A few notes are in order to draw attention to something that the reader might
have already deduced. First, the utterance topic conceptualized as Cb (the “local
topic”) is dependent on the centers of the previous utterance. The very same
sentence may or may not have such a "topic" depending on which entities have
just been evoked in the previous utterance (a case in point is the jump from (19) to
(20)). As such, Cb is clearly a relational, discourse-dependent notion (as opposed to
Cp, which depends more closely on the shape of the utterance). Second, this
notion of topic and center of attention is strictly local: it concerns the
immediately preceding utterance within a given discourse segment. Thus, a given entity
evoked by an NP might be globally topical (in the sense that the global discourse
attention is directed to it) without being locally topical, i.e. without being the Cb
of an utterance. Resuming a reference across a local transition (of different sorts)
within the same discourse segment and resuming the same reference across
different discourse segments, at the level of global discourse, are likely to involve
different linguistic resources, but might also involve a great deal of overlap as to
which resources are used.
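
The dependence of Cb on the preceding utterance can be made concrete with a small Python sketch. It assumes the standard centering definition, under which the Cb of an utterance is the highest-ranked member of the previous Cf set that is realized again in the current utterance. The function name and the encoding of Cf sets as ranked lists are illustrative, and pairing (20) with (13) is a hypothetical context, not one found in the narrative.

```python
from typing import List, Optional


def backward_looking_center(prev_cf: List[str], current_cf: List[str]) -> Optional[str]:
    """Return the Cb of the current utterance, or None if it has none.

    Both lists are ranked from most to least salient (Cp first).
    """
    realized = set(current_cf)
    for entity in prev_cf:
        # The first hit is the highest-ranked previous center realized again.
        if entity in realized:
            return entity
    return None


cf_13 = ["yokajlo'", "wichu'", "ts'en"]  # Cf of utterance (13)
cf_19 = ["much"]                         # Cf of utterance (19)
cf_20 = ["yokajlo'", "isapan"]           # Cf of utterance (20)

# In the narrative, (20) follows (19): the frog is not realized, so no Cb.
print(backward_looking_center(cf_19, cf_20))
# Hypothetically placed after (13), the same sentence would have the kid as Cb.
print(backward_looking_center(cf_13, cf_20))
```

The second call illustrates the first note above: the sentence is unchanged, and only the preceding Cf set decides whether it has a local topic.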

Utterances (15) and (20) represent transitions that are increasingly
less neutral than the default CONTINUE: they involve an anticipation of a
shift and a sudden shift, respectively, in the center of attention (Cb). It is no
coincidence that more complex constructions are used at these points: in utterance
(15), the NP anticipating the center shift (ixmuch’a) bears the topic marker ba
(with the allomorph 'a) and occupies initial position, while the NP whose
topical demotion is anticipated also bears the topic marker ba. In utterance
(20), the NP realizing the center which is promoted to default preference again
bears the topic marker ba.¹⁸ In §3.4 below, I will show that ni and ba share this
discourse-management functionality in the domain of attention transitions.


3.4 The overlap of ni and ba as NP marking devices 


Let us now see in (22-24) what happens when an entity is introduced, not as local 
topic, but as global discourse topic. These utterances show the beginning of an 
interview with a traditional drum-maker, Alberto (Alb-m). At the beginning of 
the interview, Bernardino (Bern-m) directs his attention to the camera to explain 
what the video recording session will be about, in example (22). 


(22) Une=ba u-ch-en joben i bada=ba kä-x-e
PRO3=TOP ERG3-make-IPFV[ABS3] drum and now=TOP ERG1-go-IPFV
kä-k’at-b-en-la.
ERG1-ask-BEN-IPFV[ABS3]-PL.INCL
(Bern-m): ‘He makes drums and now we are going to ask him.’
[chf_HT_ALB_7_(00:09-00:12), Delgado-Galván 2018]


Notice that joben, ‘drum’, appears as a bare noun object. After asking for the 
full name of the drum-maker and his professional activity, the next utterance (23) 
is now directed to start the main interview on drums. Now the NP joben bears 
the determiner ni and is being set up as the main topic of the global discourse. 


(23) Kachka=da u y-ut-e ni joben? Kachka
Q=PROX AUX.IPFV ERG3-build-IPFV[ABS3] DET drum Q
u-täk’-an? Kachka u-xup-o?
ERG3-start-IPFV[ABS3] Q ERG3-finish-IPFV[ABS3]
(Bern-m): ‘How is the drum made? How is it started? How is it finished?’
[chf_HT_ALB_27-28_(00:35-00:39), Delgado-Galván 2018]


18 Note that the particle ba is quite multifunctional, and flagging topicality shifts would be only
one of its possible contributions in the language.

The reply of the drum-maker validates joben as the main topic of the global
discourse, in (24).


(24) Ni joben kä-täk’-e’ kä-jok’-än dok formon.
DET drum ERG1-begin-IPFV[ABS3] ERG1-dig-IPFV[ABS3] COM chisel
(Alb-m): ‘The drum, I start to dig it out with the chisel.’
[chf_HT_ALB_29_(00:40-00:42), Delgado-Galván 2018]


It would then seem that a function of ni is to label an NP as evoking a center 
that constitutes a main global topic rather than just a local topic. But in fact, ni 
can serve the same purpose that the topic marker ba fulfilled in the examples 
(15) and (20) above as a facilitator of center shifts. I illustrate this with an extract 
from the Frog Story narrative task. After narrating how the kid of the Frog Story 
arrived at the tree and climbed on it, the storyteller announces a center shift as 
follows (notice that ni contracts to n before vowels in fast speech). 


(25) kani aw-ál-e a pas-i tünxin te'=ba?
Q ERG2-say-IPFV[ABS3] AUX.PFV exit-PFV[ABS3] middle tree=TOP
A pas-i n-aj-xoch’.
AUX.PFV exit-PFV[ABS3] DET-CLF.M-owl
(Esm-f): ‘and who do you think came out from the middle of the tree?
The owl came out!’
[chf_FrogStory_ESM_61-63_(03:42-03:49), Delgado-Galván 2018]


While in (15-17) we had a RETAIN transition followed by a SMOOTH SHIFT, here
it is a rhetorical question, rather than a RETAIN transition, that prepares the
center shift in the utterance that answers the question. The transition type of the
rhetorical question in (25) is a ROUGH SHIFT, which is then followed by
a RETAIN transition in the answer to the question. The ROUGH SHIFT is caused by
the abrupt replacement of the kid as local topic of the previous context by an
entity variable evoked by the interrogative pronoun. The RETAIN transition then
reflects the fact that attention is on the subject interrogative pronoun and
on its value in the answer, the owl. Since the ROUGH SHIFT introduces an
interrogative pronoun as a dummy topic, in the sense that it is a variable, the real
topic introduction happens when the value of this dummy topic is revealed in the
answer. The question pronoun simply removes the currently activated discourse
entity (the kid) from the center of attention, while the subject NP in the answer
fills in the corresponding empty spot with the help of a RETAIN transition. The
topic shift is therefore delayed until the second utterance of (25), where
the discourse entity ajxoch' is evoked. What matters for the discussion is
that it is precisely at this point that the determiner ni appears on the
NP, flagging a shift of center aimed at the owl. The next utterance (27) indeed
has the owl as Cb and Cp, evoked by the indexes on the verbs, thus displaying a
CONTINUE transition.


(26) Utterance (27): 
[Cb(ajxoch’), Cf(Cp(ajxoch’) > yokajlo’)] 
Cp = Cb; 
Cb = previous Cb; 
Ct: CONTINUE 


(27) A pas-i='a u-bwejtes-i yokajlo’.
AUX.PFV exit-PFV[ABS3]=TOP ERG3-scare-PFV[ABS3] kid
(Esm-f): ‘He [the owl] came out and scared the kid.’
[chf_FrogStory_ESM_64_(03:50-03:53), Delgado-Galván 2018]


In the next utterance (29), however, attention is directed to the
lowest-ranked forward-looking center of (27), the kid (yokajlo'), without any allusion to the
owl. This SMOOTH-SHIFT transition prompts the use of a non-neutral
construction, with a preposed subject NP bearing the topic marker ba.


(28) Utterance (29):
[Cb(yokajlo’), Cf(Cp(yokajlo’) > iski)]
Cp = Cb;
Cb ≠ previous Cb;
Ct: SMOOTH SHIFT


(29) De ya’-i yokajlo'=ba de iski a yál-i une.
PREP SD-DIST kid=TOP PREP above AUX.PFV fall-PFV[ABS3] PRO3
(Esm-f): ‘Afterward the kid fell from above.’
[chf_FrogStory_ESM_65a_(03:53-03:56), Delgado-Galván 2018]


In the next few lines, the narrative describes how the dog passes nearby,
running away from a swarm of wasps. Because the dog (wichu’) is the main player
in the context immediately preceding example (31) below, and it is evoked there
again by an ERG3 index in the locative relative clause, it constitutes the
backward-looking center of (31). However, it is evoked in an embedded position, and the kid
(yokajlo’), being the subject of the main clause, is set up as the preferred center.
There is an attention shift in progress.


(30) Utterance (31):
[Cb(wichu’), Cf(Cp(yokajlo’) > wichu’)]
Cp ≠ Cb;
Cb ≠ previous Cb;
Ct: ROUGH-SHIFT


(31) De wi káda an [u-]bix-e tä puts’-e, yokajlo’
PREP SD-DIST where EXIST[ABS3] ERG3-go-IPFV PREP escape-INF kid
täkä pas-i tä puts’-e.
also exit-PFV[ABS3] PREP escape-INF
(Esm-f): ‘Afterward, where he [the dog] passed escaping, the kid also
passed escaping.’
[chf_FrogStory_ESM_67_(04:02-04:06), Delgado-Galván 2018]


While yokajlo' is not marked in any way, it is in preverbal position, which
adds to its salience. However, the narrator immediately re-elaborates
the main clause of (31) as the utterance in (33) and frames yokajlo' with both
particles, ni and ba.


(32) Utterance (33):
[Cb(yokajlo’), Cf(Cp(yokajlo’))]
Cp = Cb;
Cb ≠ previous Cb;
Ct: SMOOTH-SHIFT


(33) A k’ot-i tä puts-e ni yokajlo'=ba.
AUX.PFV arrive-PFV[ABS3] PREP escape-INF DET kid=TOP
(Esm-f): ‘The kid arrived escaping.’
[chf_FrogStory_ESM_68a_(04:07-04:08.5), Delgado-Galván 2018]


A new shift comes with utterance (35), where the kid has been reduced in
salience, being evoked by the ABS3 index of the transitive verb form ubwät’esi,
while the owl (najxoch’) is evoked twice, by two NPs in the salient position of
subject of the verb. First the narrator evokes ‘the owl’ with an NP composed of the
general term for ‘bird’, modified by a relative clause (ni mut jini ka ubwát'esiba,
‘the bird that scared him’), and then the speaker zooms in on the word she was
looking for: ajxoch', ‘the owl’. To signal this transition from the kid to the owl as
the preferred center, both NPs evoking the owl are accordingly introduced by ni.


(34) Utterance (35):
[Cb(yokajlo’), Cf(Cp(mut=ajxoch’), yokajlo’)]
Cp ≠ Cb;
Cb = previous Cb;
Ct: RETAIN


(35) Porke [u-]num-e u-ch-en segui ni mut jin-i
CONJ ERG3-pass-IPFV ERG3-do-IPFV[ABS3] follow DET bird DEM-DIST
kä u-bwát'es-i='a n-aj-xoch’=ba.
COMP ERG3-scare-PFV[ABS3]=TOP DET-CLF.M-owl=TOP
(Esm-f): ‘Because he was following him, the bird that scared him, the
owl.’
[chf_FrogStory_ESM_68b_(04:06-04:14), Delgado-Galván 2018]


On several occasions the functional overlap of the determiner ni and the
enclitic ba is apparent, either because one appears instead of the other (25-29) or
because they co-occur on the same NP, as in examples (33-35). The overlap and
competition between ba and ni in marking transitions in NP salience can be nicely
observed in the following self-correction.


(36) Mach kumpale peru täkä ni bit anima-jo’, täkä bit buch’-jo’=ba,
NEG buddy but also DET small animal-PL also small fish-PL=TOP
ejte jits’-o’ täkä.
FIL be.hungry[ABS3]-PL also
(Luc-m): ‘No, buddy, but also the little animals, also the little fish are
hungry as well.’
[chf_TwoFishingMen_LUC_038-39-40_(02:02-02:09), Delgado-Galván 2018]


In example (36) we can observe two mentions of the same referent (the fish
being fed by one of the participants) with alternate NPs and parallel discourse
statuses. Interestingly, one of the alternatives is introduced by ni while the other
bears the topic marker ba instead. The reason for the rephrasing
is evidently a rectification of the description, replacing the vague bit animajob
(‘little animals’) with the more precise bit buch’jo’ (‘little fish’), but in the course of the
correction the speaker inadvertently switches from using ni to using ba for an
identical discourse status of the NP.

I now turn to a sample extracted from an interview to illustrate the association
of ni with center transitions outside a narrative monologue. In this interview, the
chapel's president, Felipe (Fel-m), explains many details of the festivities related
to the agricultural cycle and to Santiago Apóstol. To properly understand the
exchange that follows, the reader should be aware of the following cultural facts:
in Yokot'an festivities, three different types of musical ensembles can be
encountered, with different roles. In the interview selection shown below, attention
switches from one type of musical ensemble to another with regard to the question of
whether they receive any payment for their performance.¹⁹ After explaining how the
main festivity will take place, Felipe (Fel-m) adds a final comment on how in
former times the musicians, musiku, would get paid, and how drummers,
ajjobeno’, would occasionally show up (37).


(37) De ke ajn-i=ba u-toj-e'-o' musiku
PREP COMP be.located-PFV[ABS3]=TOP ERG3-pay-IPFV[ABS3]-PL musician
i abeses y-ajn-e aj-joben-o’.
and sometimes ERG3-be.located-IPFV[ABS3] CLF.M-drum-PL
(Fel-m): ‘As it was before, they would pay the musicians and sometimes
the drummers would attend.’
[chf_CONV_FEL_219_(08:21-08:24), Delgado-Galván 2018]


These are not kept as topics since, immediately after this comment, the
conversation goes on to explain other aspects of the festivity. Nevertheless, further
ahead (more than twenty lines later), the interviewer, Argelia (Arg-f), brings
back the theme of the musicians and drummers and asks whether they are
paid nowadays, in example (39), reintroducing them with ni in subject position
of a passive sentence, i.e. as preferred centers for the next utterance although they
were not evoked in the previous sentences (we have a ROUGH-SHIFT transition).
Observe that, since the interviewer completely changes the subject matter, there is
no backward-looking center: no entity from the previous utterance is taken up again in
the current one.


19 The loanword musiku, from the Spanish word for ‘musician’, músico, refers to musicians playing
European instruments (e.g. the snare drum, the bass drum and the saxophone). Besides this
ensemble, two types of native ensembles perform. The terms joben and ämay (and their
derivatives, ajjoben and ajämay, which refer to the corresponding musicians) refer to double-sided
drums and a cane flute, respectively. Finally, the terms tunkul and pochó refer to a special slit
log drum and a wax-headed flute, respectively, which also form a special ensemble.

(38) Utterance (39):
[Cb(?), Cf(Cp(⟨ajmusiku, ajjoben⟩))];
Cp ≠ Cb;
Cb ≠ previous Cb;
Ct: ZERO («ROUGH-SHIFT) 


(39) I u-toj-k-an kärake jin-i n-aj-joben-o'
and ERG3-pay-PASS-IPFV Q COMP DEM-DIST DET-CLF.M-drum-PL 
n-aj-musiku? 
DET-CLF.M-musician 
(Arg-f): ‘And do the drummers, the musicians, get paid?’ 
[chf CONV FEL 247 (09:17-09:20), Delgado-Galván 2018] 


Felipe (Fel-m) selects a subtopic as backward-looking center, the musicians, 
and answers about them that they are paid, in example (41). 


(40) Utterance (41):
[Cb(ajmusiku), Cf(Cp(ajmusiku))];
Cp = Cb;
Cb ≠ previous Cb;
Cb ⊂ previous Cb;
Ct: CONTINUE or SMOOTH SHIFT


(41) Aj-musiku u-toj-k-an une.
CLF.M-musician ERG3-pay-PASS-IPFV PRO3
(Fel-m): ‘The musicians, they get paid.’
[chf_CONV_FEL_248_(09:19-09:21), Delgado-Galván 2018]


In example (43), the interviewer (Arg-f) now reselects the drummers as center
of attention, provoking a ROUGH-SHIFT; as expected, the NP is marked with
ni, and in fact also with ba. Observe that, due to the lack of competition with any
other referential device, the sole referential device of the utterance gets maximal


20 Since the interviewer puts forward a question about two discourse entities in (39), the
transition type of (41) is not exactly represented by the available types, but would rather be an
intermediate case between SMOOTH SHIFT and CONTINUE, since the Cb is not identical to, but
is included in, the previous Cb. It is possible to classify the transition as a CONTINUE if the
inclusion (Cb ⊂ previous Cb) is emphasized, or as a SMOOTH-SHIFT if the inequality (Cb ≠
previous Cb) is emphasized. These details are not important for the aim of our discussion.

salience, and thus its evoked entity turns into the preferred center Cp of the
utterance. No Cb exists since no entity from the previous utterance (41) is evoked
again in (43).


(42) Utterance (43): 
[Cb(?), Cf(Cp(ajjoben))] 
Cp ≠ Cb;
Cb ≠ previous Cb;
Ct: ZERO («ROUGH-SHIFT) 


(43) i ni aj-joben-ob=ba?
and DET CLF.M-drum-PL=TOP
(Arg-f): ‘and the drummers?’
[chf_CONV_FEL_249_(09:21-09:23), Delgado-Galván 2018]


Felipe (Fel-m) accordingly accepts the local-topic switch and answers about
the drummers. Observe how in both examples (43) and (45) the NP is marked
in an identical way, first to flag a ROUGH-SHIFT and then to accept
it.


(44) Utterance (45):
[Cb(ajjoben), Cf(Cp(ajjoben) > payment)]²¹
Cp = Cb;
Cb = previous Cb;
Ct: CONTINUE


(45) N-aj-joben-ob=ba igual täkä u-toj-k-an peru une mach
DET-CLF.M-drum-PL=TOP same also ERG3-pay-PASS-IPFV but PRO3 NEG
y-o u-ch’-e-jo’.
ERG3-want ERG3-take-IPFV[ABS3]-PL
(Fel-m): ‘The drummers are also paid, but they don’t want to take it.’
[chf_CONV_FEL_250_(09:23-09:26), Delgado-Galván 2018]


The following sequence of utterances (separated by commas under the same 
example 47 and with transitions labeled as 47a, 47b and 47c) maintains the same 
local topic (Cb) and the same anticipated topic (Cp). Accordingly, the drummers 
are evoked as minimally as usual in these cases: with the person markers on the 
verb only, without using an NP introduced by ni. 


21 A payment of some kind is evoked by the ABS3 index from uch’ejo’.

(46) a. Utterance (47a), corresponding to si uk’atäno’ chich:
[Cb(ajjoben), Cf(Cp(ajjoben) > payment)]²²
Cp = Cb; 
Cb = previous Cb; 
Ct: CONTINUE 

b. Utterance (47b), corresponding to peru pekenia koperasion ubintejo’ne: 
[Cb(ajjoben), Cf(Cp(ajjoben) > koperasion)] 
Cp = Cb; 
Cb = previous Cb; 
Ct: CONTINUE 

c. Utterance (47c), corresponding to mach uk’atänjo’ pwej una kantidad:
[Cb(ajjoben), Cf(Cp(ajjoben) > kantidad)] 
Cp = Cb; 
Cb = previous Cb; 
Ct: CONTINUE 


(47) Si u-k’at-än-o’ chich, peru pekenia koperasion
yes ERG3-ask-IPFV[ABS3]-PL true but little contribution
u-b-int-e-jo’=ne, mach u-k’at-än-jo’ pwej
ERG3-give-PASS-IPFV[ABS3]-PL=PRO3 NEG ERG3-ask-IPFV[ABS3]-PL thus
una kantidad.
one amount
(Fel-m): ‘They ask, yes, but they are given a small contribution, they don't
request a (fixed) amount.’
[chf_CONV_FEL_251-252_(09:26-09:32), Delgado-Galván 2018]


But then the interviewer (Arg-f) switches the center of attention once more,
now to request information on the last type of musical ensemble (tunkul-pocho
musicians), performing a ROUGH-SHIFT transition in example (49).


(48) Utterance (49): 
[Cb(?), Cf(Cp(tunkul-pocho musicians))] 
Cp ≠ Cb;
Cb ≠ previous Cb;
Ct: ZERO («ROUGH-SHIFT) 


22 Again, a payment of some kind is evoked by the ABS3 index from uk’atäno’.

(49) i ni jin u-jäts’-e’ este ni tunkul i
and DET DEM ERG3-hit-IPFV[ABS3] FIL DET slit.log.drum and
pocho=ba?
wax.headed.flute=TOP
(Arg-f): ‘and those who play the tunkul and the pocho?’
[chf_CONV_FEL_253_(09:33-09:36), Delgado-Galván 2018]


Observe how this is a more complicated NP than the one used in (43) above.
The determiner ni marks a relative clause, ‘those who...’ (jin ujäts’e’ ni tunkul
i pochoba), but it also introduces the NP tunkul inside the clause. Again we
find ba seemingly playing a role similar or complementary to that of ni in the context
of a ROUGH-SHIFT transition.

From the kind of data presented in this section, I conclude that ni has a function
related to topicality-shifting. In particular, it seems to flag mostly ROUGH-SHIFT
transitions (including ZERO transitions), and occasionally RETAIN transitions. The
fact that attentional shifts can be performed by a sequence of two transitions,
with the first preparing the second, complicates the assessment of these results.
For example, in the case of ni, both cases of RETAIN are paired with a
previous transition. Also, (33) is technically a SMOOTH-SHIFT transition, but it could be
counted here as a ROUGH-SHIFT transition, because it is a rephrasing of a previous
utterance whose transition belongs to this category. Thus I interpret the
rephrasing in (33) as a correction or reinforcement rather than a genuine new
transition.²³ Therefore I assign to (33) the same transition category as the previous
utterance. In the case of repetitions of an NP as acceptance of a ROUGH-SHIFT
transition proposed by another speaker, ni can appear in CONTINUE transitions
(see the sequence 43-45). I do not display such repetition cases in Figure 3.

Most of these ni and ba insertions in NPs seem to involve a transition in which
Cp ≠ Cb. Regarding the overlap of function between ni and ba, it is beyond the
scope of this study to establish whether there are differences between them (if 
any) in these contexts. The main point of these distributional analogies is to make 
a stronger case for ni to be a transition discourse-marker. Now that I have es- 
tablished a discourse-management basis for the use of ni, I will link, in §4, the 
synchronic array of its uses to the diachronic picture of ni as a development of 
the demonstrative jini. This will not only clarify the status of ni as a nascent def- 
inite marker but will also throw light on two apparently disparate observations 
in the grammaticalization literature of articles. The section begins with a very 
brief display of the diachronic evolution of ni as proposed in the literature. 


23 Adopting such a view would imply that sequences of utterances of which the second is a
correction, re-elaboration or rephrasing of the first should be treated differently from regular
sequences.

Center transitions correlated to ni

RETAIN transition [Cp ≠ Cb; Cb = previous Cb] → (25) and (35).
SMOOTH-SHIFT transition [Cp = Cb; Cb ≠ previous Cb] → (33).
ROUGH-SHIFT transition [Cp ≠ Cb; Cb ≠ previous Cb] → (33).
ZERO = ROUGH-SHIFT transition [Cp ≠ Cb; Cb ≠ previous Cb, by lack of Cb
(degenerate case)] → (39), (43) and (49).


Center transitions correlated to ba

RETAIN transition [Cp ≠ Cb; Cb = previous Cb] → (15), (35).
SMOOTH-SHIFT transition [Cp = Cb; Cb ≠ previous Cb] → (29).
ROUGH-SHIFT transition [Cp ≠ Cb; Cb ≠ previous Cb] → (33).
ZERO = ROUGH-SHIFT transition [Cp ≠ Cb; Cb ≠ previous Cb, by lack of Cb
(degenerate case)] → (20), (43), (49).


Figure 3: Transition-motivated framing of NPs with ni and ba

4 Ni from demonstrative to article


I mentioned earlier that ni is likely a recent innovation. While variants of the
distal demonstrative jini are attested in late epigraphic writing on pottery
(Mora-Marín 2009: 114, 120-121) and in the only known colonial text of Yokot'an, dating
from 1610-1612, the Maldonado-Paxbolon-Papers (Smailus 1975), the determiner
ni is not found in historical records.²⁴ Mora-Marín (2009: 120-121) makes the
explicit claim that ni grammaticalized from jini (Figure 4).


Proto-Mayan  =>  Proto-Ch’olan  =>  Proto-Western Ch'olan  =>  Yokot'an
*ha’+in      =>  *ha’in+i       =>  *hin+i                 =>  hini, ni


Figure 4: Reconstruction of the sources of jini and ni 


This diachronic axis linking ni to the distal demonstrative jini allows me to
exploit grammaticalization theory.²⁵ The grammaticalization approach and the
development paths it suggests provide a detailed typological grid to classify and
understand the functioning of article-like forms in under-described languages
(Himmelmann 2001: 832). For this reason I provide in §4.2 a brief overview of
the grammaticalization paths of articles from demonstratives proposed in the
literature, as these developments are directly relevant to the forms available in
Yokot'an. Each stage or transition between stages also helps to crystallize
particular sets of uses of a form in a given language. Since ni presumably originates in
the demonstrative jini, and given its main discourse function as a device for managing
the center of attention, rather than as a bearer of special denotational semantics, one


24 A candidate for one instance of the form ni in the document would be the written sequence
(hainnicutthan) which appears at line 13 of page 163 in the manuscript. The interlinearized ver- 
sion can be consulted in Smailus (1975: 71, 158) who suggests a reading of the sequence as hain-i 
cut than, rather than hain ni cut than. This analysis would settle the sequence (hainni) as de- 
monstrative plus (deictic?) enclitic (haini) rather than demonstrative plus determiner (hain ni). 

?"Ihe complex interaction of deictic enclitics, focus markers and pronominal/demonstrative 
roots gives some room for slightly different proposals on diachronic developments. For exam- 
ple, Mora-Marín (2009: 121) claims that ha'in was used as an article in Proto-Ch'olan and that 
both the Proto-Western-Ch'olan and the Proto-Eastern-Ch'olan branches developed it further 
as definite article. A somewhat different proposal, which shows in more detail the complexity 
of the process, can be consulted in Becquey (2014: 392-422). The overview of such different 
proposals is beyond the scope of this study but suffices to say that in either case hini and ni 
are linked, either by both being directly derived from a common ancestor demonstrative/focus 
marker haini or by ni being a further reduction of hini. 
may ask how advanced it is along the various grammaticalization paths from
demonstrative to article. I start by pointing out characteristics that set ni apart from a
demonstrative.


4.1 Telling apart articles from demonstratives 


4.1.1 Frequency criteria 


Faced with a puzzle similar to mine, namely, how to assess the function of a
certain determiner in an under-described language, Cyr (1993) takes a small sample
of languages and counts the frequency of use of demonstratives and articles. On this
basis she proposes the following frequency criterion as an auxiliary tool to assess
the likelihood that a given particle is an article in an undescribed language:


[...] all the languages that have a definite article use it with more than 
39% but with fewer than 55% of the NPs. Moreover, in any language, the 
frequency in the use of a demonstrative determiner does not exceed 7.07% 
of the NPs. (Cyr 1993: 222) (Sample: Finnish, French, Italian, Cree, Swedish, 
Montagnais, German) 


Table 3 and Table 4 show a similar count for Yokot'an, based on the Two
Fishingmen story and the Frog Story narrative:


Table 3: Frequency of determiners in the Two Fishingmen story 


Lexical NP   Bare Noun   Indef.   ni      Poss.   Dem.
271          86          12       82      91      6
100%         31%         4.3%     29.6%   33%     2.1%


Table 4: Frequency of determiners in the Frog story 


Lexical NP   Bare Noun   Indef.   ni       Poss.   Dem.
106          69          11       13       12      1
100%         65.1%       10.4%    12.26%   11.3%   0.94%


Judging by these frequency figures, and taking the numbers from Cyr (1993) as a
guide, the determiner ni clearly runs well below the expected frequency of article
use, but above the expected frequency of demonstrative use.
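
The comparison with Cyr's (1993) thresholds can be made explicit with a short calculation. What follows is only a minimal sketch in Python and is not part of the original study: the function name cyr_band and the band labels are hypothetical, the thresholds are those quoted from Cyr above, and the percentages are the ones reported in Table 3 and Table 4.

```python
# Thresholds quoted from Cyr (1993: 222); names and labels are illustrative.
ARTICLE_MIN = 39.0        # definite articles: more than 39% of the NPs ...
ARTICLE_MAX = 55.0        # ... but fewer than 55%
DEMONSTRATIVE_MAX = 7.07  # demonstratives: at most 7.07% of the NPs


def cyr_band(share_percent: float) -> str:
    """Place a determiner's share of NPs relative to Cyr's two bands."""
    if share_percent <= DEMONSTRATIVE_MAX:
        return "demonstrative band"
    if ARTICLE_MIN < share_percent < ARTICLE_MAX:
        return "article band"
    return "outside both bands"


# Shares (in % of lexical NPs) as reported in Table 3 and Table 4.
reported = {
    ("Two Fishingmen", "ni"): 29.6,
    ("Two Fishingmen", "Dem."): 2.1,
    ("Frog Story", "ni"): 12.26,
    ("Frog Story", "Dem."): 0.94,
}

for (text, form), share in reported.items():
    print(f"{text:15} {form:5} {share:6.2f}%  {cyr_band(share)}")
```

Under these assumptions, ni falls outside both bands in the two texts (above the demonstrative ceiling, below the article floor), while the demonstratives stay within the demonstrative band.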
One may interpret this in two ways. On one interpretation, the element counted is
not really a completely developed article, in the sense that its range of uses is still
limited and leaves out many uses of more prototypical articles.²⁶ On a different
interpretation, the element in question can
be used in every way a prototypical article can, but competes with other formal
resources in many of these contexts. Both alternatives would account for a
frequency lower than expected under Cyr's (1993) criteria. What should be noted,
however, is that in a language where such an article is optional in most contexts,
the frequency figures can be subject to great variation.


4.1.2 Qualitative criteria: Anti-demonstrative contexts 


Since at any stage of its grammaticalization a definite article can preserve some
distributions and functions from previous stages, it can share domains of use with
demonstratives. However, some of the new extended uses are less well suited for
demonstratives, and this is one of the clues that differentiate a definite article
from its ancestor. One such use is the so-called larger situation use, in which the
article accompanies first mentions of entities that are considered to be
identifiable by general knowledge of the world and culture (Himmelmann 2001). We
have seen in (7) above that with globally unique entities such as the sun, the use of ni
is avoided. However, ni becomes more readily available with institutional roles.
This is shown in examples (50) and (51), from a conversation in which Alfonso (Alf-m)
explains the role played by some of the specialists in the village. Some
concrete cases are discussed, but many general statements are made which do not
concern any particular individual but rather the role itself. Both (50) and (51) are
generic statements not involving particular individuals.


(50) Dos año [u-]num-e ni patron.
two year ERG3-pass-IPFV DET patron
(Alf-m): ‘The patron lasts 2 years (in charge).’
[chf HPatron ALE 34 (01:20-01:22), Delgado-Galván 2018]


Utterance (50) is part of a general characterization of the patron role (in
fact, (50) is itself a characterizing statement), and (51) is part of a general account


²⁶Interestingly, Greenberg (1978: 62) considers an example from Bwamu (Niger-Congo family)
of a "nascent article which is [...] at a point between a zero stage demonstrative and a Stage I
definite article", but ultimately rejects it as a candidate for his Stage I article (definite article).
One main factor that pushes him to exclude it from a Stage I status is Manessy's (1960: 93)
report on the low discourse frequency and the optionality of its use. The exact same comment
could be directed at the determiner ni of Yokot'an.
of the diseases provoked by the yumka’ob spirits, the “owners of the earth”, but
it is not a characterizing statement.


(51) Ora aj-t'ábüla mach une uy-äk’-e’ u-ba-une, peru duro 
now CLF.M-adult NEG PRO3 ERG3-heal-IPFV ERG3-REFL=PRO3 but hard 
chita tuba u-ts’äkäl-in ni yerbateru. 
also(?) PREP ERG3-cure-IPFV[ABS3] DET healer 
(Alf-m): ‘Now the adults don't heal, but it is hard as well for them to get 
cured by the healer’ 

[chf_HPatron_ALF_630-631_(28:44-28:50), Delgado-Galván 2018]


Reference to kinds is also a use for which an article is better suited than a
demonstrative. Two examples of kind reference to deer are given below.


(52) Ida=ba pue ajn-i Een chimay. Ni chimay
here=TOP so be.located-PFV[ABS3] many deer DET deer
u-ts'on-e-o' ni gente.
ERG3-hunt-IPFV[ABS3]-PL DET people
(Alb-m): ‘Here there was a lot of deer. The people hunt the deer.’
[chf HT ALB 72-73 (02:18-02:24), Delgado-Galván 2018]


(53) I che’ che’chich xup-i ni chimay. Ma’ ni an
and so of.course finish-PFV[ABS3] DET deer NEG ADV EXIST[ABS3]
bada=ba.
now=TOP
(Alb-m): ‘So, of course the deer is finished. There is no more now.’
[chf HT ALB 123 (04:29-04:31), Delgado-Galván 2018]


Finally, the lack of deictic contrast of ni can be observed in (54), which is the
closing line of the Pear Story narrative. The co-occurrence within the same NP of ni
and the proximal demonstrative jinda suggests that ni no longer introduces a
deictic contrast. For if ni still held the (distal) deictic value of its diachronic source
jini, it should be incompatible with the proximal deictic value contributed by
jinda.²⁷ Such loss of deictic contrast is one of the functional criteria for identifying
that a former demonstrative has undergone grammaticalization (Diessel 1999:
118).


²⁷Knowles-Berry (1984: 208, 236) provides a sample of an NP in which a distal demonstrative jini
shares a noun with a proximal deictic enclitic (da): jin-i winik-da. I have not found such NP
types in my corpus, and since no context is provided (not even sentential context), it is hard
to assess this sample.




(54) Kama jin-i ni ts'aji jin-da.
Q DEM-DIST DET chat DEM-PROX
(Esm-f): ‘That is how this story is.’
[chf PS ESM 068 (03:44-03:46), Delgado-Galván 2018]


Notice, additionally, that ni can no longer inflect for deictic distance, as jini can:
jin-i/jin-da, which is also a (morphological) criterion in Diessel (1999: 118). Clearly,
then, the form ni is not just a phonological reduction of jini; it constitutes a new
element located somewhere along the grammaticalization path towards
a different marker. It is time now to compare the different uses of ni against the
background of the paths proposed for the development of articles.


4.2 Grammaticalization path and stages 


I will now assess the determiner ni against the grammaticalization stages of a
definite article as presented by Greenberg (1978: 61-74) and Hawkins (2004: 84-86),
which are shown schematically in Table 5 and Table 6. These illustrate the
paths of development from a demonstrative source; other sources are not of
interest here. Greenberg (1978) proposes a grammaticalization scheme in three steps
for the definite article: Stage 0, Stage I early and Stage I late. Hawkins (2004) goes
into more detail and proposes four logical steps of development for definite
articles, but on the other hand he does not consider as a definite article any determiner
that still conveys deictic contrast. Thus, Hawkins's Stage 0 encompasses
Greenberg's Stages 0 and I early (since deictic contrast still operates), while Greenberg's
Stage I late is split into Hawkins's (2004) Stages 1-2-3.


Table 5: Article grammaticalization stages (Greenberg 1978) 


Stage 0                =>  Stage I (early + late)  =>  Stage II                   =>  Stage III
demonstrative          =>  definite article        =>  specific article           =>  nominality marker / gender marker
pure exophoric deixis  =>  identified in general   =>  specific but unidentified  =>  sign of nominality


In the coarsest scheme (Greenberg 1978), the main functionality of the
determiner ni can be located in between Stage 0 and Stage I. Greenberg’s Stage II
(corresponding to Hawkins's Stage 4) and Stage III (not represented in Hawkins
2004) have marginal relevance here, as uses of ni related to specificity or
nominality may only appear in restricted contexts (negative existential constructions and
some syntactically nominalized clauses (Becquey 2014: 397, 408), respectively).
Grammaticalization paths as presented in Table 5 and Table 6 are not to be
taken as linear developments, but rather as logical steps that can be taken at
different times or simultaneously in different pragmatic and constructional
contexts. This means that the same form easily assumes different uses according to
individual constructions. A case in point, to be presented in §4.3, is the negative
existential construction, which shelters a specialized use of ni that has more in
common with situational uses, as in example (9), in which the NP concerned
is not necessarily involved in the evolution of an anaphoric/topical chain. This
lack of linearity is what leads to fragmented uses of a definite marker (see Lyons
1999: 159). It is also seen in the fact that a definite article in an early stage can
already show characteristics of even the latest stages, but in restricted contexts.

The initial stage (Stage 0 in all authors) corresponds to a demonstrative, whose
function is to perform situational or exophoric reference and which introduces a deictic
contrast with other deictic forms. Generally, it is the third-person/distal
proximity deictic element from the paradigm that gives rise to the grammaticalization of
an article. Yokot'an is no exception, as it is indeed the distal demonstrative
jini that provides the base for ni. The exophoric/contrastive nature of the initial
demonstrative base makes it incompatible with a generic interpretation. Clearly,
then, as shown in example (53) above, the form ni is beyond the initial stage
(Stage 0).

The initial step of development towards an article extends the use of the
demonstrative to also encompass endophoric reference, as an anaphoric (or cataphoric)
device. This secondary use of the demonstrative as an anaphoric device is shown,
for Yokot'an jini, in example (55) from the Pear Story narrative. After a digression
describing how a boy passed with a goat near the baskets of pears, the narrative
once more returns to what the pear-collecting man is doing. The reference to him
is then resumed with an anaphoric definite NP, with jini.


(55) De ya-i yok winik jin-i=ba t'üb-i cha’-num
PREP SD-DIST little man DEM-DIST=TOP ascend-PFV[ABS3] two-NUM.CLF
tan te.
PREP tree
(Esm-f): ‘Then the man (that has been mentioned) climbed again in the
tree.’
[chf PS ESM 016 (01:16.5-01:18.5), Delgado-Galván 2018]


Such an endophoric function may turn into the main or sole use of the
demonstrative on its way towards developing into an article (Stage I early in Greenberg,
but still Stage 0 in Hawkins as long as deixis is not dropped). At the next stage
(Stage I late in Greenberg, Stage 1 in Hawkins), the identifiability of the referent is
assessed with respect to the whole visible situation or the whole previous text in
memory, not just the recent text or some deictically selected subsituation.
Identifiability is expanded to both textual and situational assessment and therefore the
article use is restricted to anaphoric reference or to the immediate situation (for
an immediate situation use of ni, consider that its insertion is indeed possible in
an example like (9) above).

A further development is the expansion of the contexts (or "pragmatic set")
within which uniqueness is assessed to also consider non-visible and/or larger
situations (Stage 2 in Hawkins 2004, Stage I late in Greenberg 1978). The
association of reference gets extended from anaphoric links to general-knowledge
inferences and stereotypic frames. We have seen that although ni has not extended
to be naturally accepted with entities like the sun, the moon, etc. (see (7)), it is
common with institutionalized roles (50) or in relation to some stereotypic frame.
Finally, a definite article reaches Hawkins's (2004: 85) Stage 3 when its use
expands to unanchored uniqueness and generalizes to inclusiveness (i.e. a sort of
plural uniqueness, the maximality of a group). At this point, generic reference is
a suitable context for the article.

With such a development path as a background, it can be observed that the
determiner ni exhibits compatibility with some of the uses in Hawkins's Stages
1-2-3 (immediate situation use, institutionalized roles, kind denoting). However,
I wish to argue that the main characteristic function of ni is still at the transition
between Stage 0 and Greenberg's Stage I or Hawkins's Stage 1. To see this,
consider the following quote from Heine & Kuteva who, based upon Diessel (1999:
96, 128-129), explain:


Since the adnominal anaphoric demonstrative serves a discourse internal 
function - to refer to the same referent as its antecedent and thus track 
participants of the preceding discourse - it serves as a common strategy to 
establish major participants in the universe of discourse. Its use involves 
non-topical antecedents that tend to be somewhat unexpected, contrastive, 
or emphatic. At a next stage of development, the adnominal anaphoric 
demonstrative becomes a definite article, whereby its use is gradually ex- 
tended from non-topical antecedents to all kinds of referents in the preced- 
ing discourse. (Heine & Kuteva 2006: 101-102, emphasis mine) 


It is interesting to contrast this report with the one pictured by Givón (2001:
474), which I quoted earlier: “Grammaticalized definite markers [...] arise first to
mark topical definites.”
At first, there seems to be a contradiction. Yet there isn't. By joining the
observations in both quotes we can see that an attentional transition underlies the
reported facts: non-topical antecedent and topical resumptive NP. Think of the
antecedent as a FORWARD-LOOKING CENTER. Think of its later “unexpectedness”
as reflecting the fact that it is not currently set as a PREFERRED CENTER (or,
perhaps, not even as a Cb). Think of the “topical” resumptive NP as a
BACKWARD-LOOKING CENTER and/or as a PREFERRED CENTER. Now we see that what seemed
a contradiction hints at the specialization of an early definite of the kind found
in Yokot'an. The rationale of its use is not to flag anaphoric NPs with non-topical
antecedents or to flag topical anaphoric NPs; rather, it is to mark the attentional
transition itself.²⁸ A topic shift can be decomposed into two steps or components,
according to the Centering Theory model. One step is to announce or prepare
an incoming shift by setting Cp ≠ Cb (RETAIN transition). The second step is to
execute such a shift by setting Cb ≠ previous Cb (SMOOTH-SHIFT transition). Both
moves can be collapsed into a single move (ROUGH-SHIFT, and ZERO as a special
case). From Figure 3 above, it seems that ni can flag both types of transition (and
the one containing both moves). Given the preference for more cohesive
transitions over increasingly less cohesive ones (Figure 2), however, one can expect ni
to be more systematically used to flag the least cohesive transitions: ZERO and
ROUGH-SHIFT. In fact, the condition Cp ≠ Cb across transitions covers most of
the discourse-related cases I have illustrated in the present paper.²⁹
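
The transition types mentioned above can be sketched as a small decision procedure. This is a minimal illustration in Python under stated assumptions, not the procedure used in this study: the four-way split (CONTINUE, RETAIN, SMOOTH-SHIFT, ROUGH-SHIFT) follows the usual Centering Theory formulation, while the function names, the treatment of an utterance without a Cb as ZERO, and the treatment of an undefined previous Cb are assumptions made here for the sake of the example.

```python
from typing import Optional


def classify_transition(cb: Optional[str], cp: str,
                        prev_cb: Optional[str]) -> str:
    """Classify the transition into the current utterance.

    cb      -- backward-looking center of the current utterance (None if absent)
    cp      -- preferred center of the current utterance
    prev_cb -- backward-looking center of the previous utterance (None if absent)
    """
    if cb is None:
        # Assumption: an utterance without a Cb counts as a ZERO transition.
        return "ZERO"
    # Assumption: an undefined previous Cb is treated as "no change of Cb".
    cb_kept = prev_cb is None or cb == prev_cb
    if cb_kept:
        return "CONTINUE" if cp == cb else "RETAIN"
    return "SMOOTH-SHIFT" if cp == cb else "ROUGH-SHIFT"


def cp_differs_from_cb(cb: Optional[str], cp: str) -> bool:
    """The condition Cp != Cb discussed in the text."""
    return cp != cb


# Hypothetical referents, loosely inspired by the Pear Story.
examples = [
    ("man", "man", "man"),
    ("man", "boy", "man"),
    ("boy", "boy", "man"),
    ("boy", "goat", "man"),
    (None, "goat", "man"),
]
for cb, cp, prev_cb in examples:
    print(cb, cp, prev_cb, classify_transition(cb, cp, prev_cb),
          cp_differs_from_cb(cb, cp))
```

In this sketch the condition Cp ≠ Cb holds for RETAIN, ROUGH-SHIFT and ZERO, but not for CONTINUE or SMOOTH-SHIFT.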

Heine & Kuteva (2006) associate this particular function of flagging NPs which
anaphorically evoke unexpected/non-topical entities with a stage previous to the
demonstrative being a definite article. Givón (2001), on the other hand, associates
the function of flagging topical NPs with an early definite article. Under this view,
the determiner ni is better characterized as an early definite article, one that
has not even reached Hawkins's (2004) Stage 1. Given Diessel's (1999) scheme of
definite article grammaticalization (Figure 5), ni would be an anaphoric
demonstrative specialized in anaphorically picking up non-topical referents and turning
them topical (expectedly or not, i.e. with or without warning), while the original
demonstrative source jini still holds a purely distal-anaphoric function.

Since the main focus of definiteness studies has been the Hearer-status and 
how it may grammaticalize, the other possible functions of an article, related 
to Discourse-status, have received less attention. In the above descriptions of 


²⁸The reader should be aware, however, that I am here jumping from informal notions of
topical/non-topical to technical and very particular notions of “topical” vs “non-topical”, as
embodied by the notions of centers within Centering Theory. Yet I think the jump is
enlightening for languages like Yokot'an.

²⁹Further investigation would be needed, but these results are already very suggestive.
exophoric demonstrative → anaphoric demonstrative → definite article


Figure 5: Diessel's (1999) scheme of definite article grammaticalization 


grammaticalization paths of the article, the Discourse-status role appears as
incidental, more as an introduction context than as a main function that can be
fulfilled by the article. It does not explicitly appear in Table 5 or Table 6. In many
languages this particular path of evolution of articles via the Discourse-status
might be more relevant to understanding their synchronic use. Not only does it provide
a starting point for understanding the distribution of an otherwise unsystematic
article, but it also explains its lower frequency and its relative optionality. While
ni has extended to being definite (in terms of the generality of contexts in which
it can appear), it is still the initial specialized function of discourse management
of transitions which prompts its minimal occurrences. Since ni developed from
a demonstrative and its main function is related to topicality while having lost
any deictic value, one may wonder if it should not be regarded as a purely
pragmatic marker of topicality issued from a demonstrative, in a fashion similar to
topic-markers in a selection of Papuan languages (de Vries 1995). Two observations
speak against this. Firstly, ni is
restricted to the noun phrase, while its competitor, the topic-marker ba, is not
restricted in this way (nor are the corresponding Papuan examples of topic
markers in de Vries 1995). Furthermore, in some specialized contexts, one can see
ni inserted to convey features like specificity/referentiality, akin to more
prototypical definite articles. This is what I will illustrate in the following section.


4.3 From topicality to specific referentiality marker: Special contexts 


Articles often have, as their most abstract function, the function of guaranteeing the
syntactic nominality of the expression they modify. In the most syntacticized
way, this means literally creating an argument from what otherwise would be
interpreted as a predicate and would be unable to occupy argument positions (Gillon 2015:
176). Such a syntactic contrast may evolve initially from a more semantic contrast
that opposes noun phrases interpreted as referring to specific entities to
other noun phrases interpreted as not referring. The examples in (56) illustrate how
special contexts can trigger a use of ni where its discourse-salience function is
exploited to force (specific) referentiality. Matilde (Mat-f) is telling the story of
how she got married and moved in with her mother-in-law. While she was happy
with how her mother-in-law treated her, she points out an unpleasant surprise
in line (56b): while the kitchen has an electrical grinder now, that grinder was
not there when she moved in, and she had to grind manually with a grinding stone.
(56) a. Kol-on kä-nojna’. Kol-on chich dok une, ti’i
leave-ABS1 ERG1-mother.in.law leave-ABS1 always COM PRO3 well
u-k'ajalin täkä.
ERG3-thought also
(Mat-f): ‘I stayed with my mother in law. I stayed always with her,
she treats me very well.’
[chf CONV MAT 501-502 (19:30-19:36), Delgado-Galván 2018]
b. Peru mach ajn-i ni molino une.
but NEG be.located-PFV[ABS3] DET grinder PRO3
(Mat-f): ‘But the grinder was not [there].’
[chf CONV MAT 504 (19:36-19:38.5), Delgado-Galván 2018]


The crucial point is that while under positive polarity a bare noun is generally
enough to be referential, negative polarity in an existential context forces the
speaker to call in the assistance of ni for the noun to unambiguously refer (56b).
The contrast is displayed below for more clarity: with ni, in example (57a),
the negative context translates as negating the location of some referred object.
But without ni, in example (57b), the negative context is readily interpreted as
negating the existence of an object, especially when (as in this case) there was
no previous mention of the object in the conversation.


(57) a. Mach ajn-i ni molino une.
NEG be.located-PFV[ABS3] DET grinder PRO3
(Mat-f): ‘The grinder was not [there].’
[chf CONV MAT 504 (19:36-19:38.5), Delgado-Galván 2018]
b. Mach ajn-i molino une.
NEG be.located-PFV[ABS3] grinder PRO3
(Esm-f): ‘There was no grinder.’ [My elicitation]
c. Ma’ ajn-i baile une.
NEG be.located-PFV[ABS3] dance PRO3
(Mat-f): ‘There was no dance.’
[chf CONV MAT 474 (18:31.5-18:32.5), Delgado-Galván 2018]


Obviously, when what matters is the type of object rather than some specific
instance, no ni is likely to be found, as in (58), where Matilde is explaining
that you would feed small chickens with maize dough when no (industrialized)
animal food is available:³⁰


³⁰A detail that the reader might observe is that Yokot'an has two existential verbs (in the sense
of being used in such constructions): an, glossed EXIST, which does not inflect for TAM, and
ajne, which inflects for TAM. Since an is a non-verbal predicate unable to take TAM inflection,
ajne is used instead in all “tensed” existential constructions.
(58) I une, xix a-b-en une, mach an
and PRO3 maize.dough ERG2-give-IPFV[ABS3] PRO3, NEG EXIST[ABS3]
alimento, yok xix a-b-en une.
industrial.animal.food, small maize.dough ERG2-give-IPFV[ABS3] PRO3
(Mat-f): ‘And those, you give maize dough, there is no food, (so) you give
them small maize dough.’
[chf CONV MAT 150-151 (04:41.7-04:48), Delgado-Galván 2018]


The presence of ni in this negative context would again suggest
an interpretation of mach an as the negation of a location rather than of
existence (like: ‘the food is not there’). It is precisely in these specialized contexts,
negative existential constructions, that ni gets associated with a specific
reference interpretation, since in most other contexts specific referentiality is tied to
nouns themselves by default, with ni simply flagging a switch in attention regarding
the flow of discourse. However, this marginal use, along with its inability to
appear outside NPs, supports placing the determiner ni in the category of definite
articles rather than in the category of topic markers.


5 Concluding remarks 


I have examined Yokot'an's candidate for a definite determiner, the marker ni.
In trying to unravel the basis of its use, I attacked the problem from two sides. I
started with a synchronic textual-analysis perspective. In those texts with
minimal occurrences, I isolated a discourse pattern for the presence of ni using
Centering Theory as a heuristic tool. On the other hand, I also used a diachronic
perspective in which I projected some of the attested possible uses of ni onto the
grammaticalization paths proposed in the literature for the development of
definite articles from demonstratives. A general observation that guided this study
is the relatively low frequency and relative optionality of this particle. In this
sense, I used counting/distributional criteria regarding its frequency and its
optionality as compared to cross-linguistic expectations in order to determine that
a pragmatic/discourse-based explanation was called for and to show, with the help
of more qualitative clues, that ni was beyond the grammaticalization Stage 0
associated with the demonstrative source.

I conclude that ni is more a discourse salience-oriented than a reference-oriented
resource, in the sense that its likelihood of being used has more to do with
attentional transition types than with identifiability properties of the NP involved.
Such an orientation and the overlap in function of many different linguistic
resources also allow more stylistic variation among speakers' use. Such variation
accounts for the fact that low frequency does not necessarily correlate with an
article whose span of uses is limited to early stages of grammaticalization.
The optionality of an article and its lower frequency are in principle independent of
the degree of development regarding the span of possible uses an article bears
(as already suggested in general by Dryer 2014).

Both the lower frequencies and the relative optionality of the definite determiner in
Yokot'an are a more direct reflection of multiple resources in the language
overlapping on similar functional domains than a reflection of only the intrinsic
development of the article. For example, in certain contexts (such as negative existential
statements) it can be used to indicate referentiality/specificity, but in the overall
system, (bare) noun phrases do so by themselves by default. Similarly, while the
main function of ni is to flag attentional transitions (local topicality switches),
in some contexts it can be complemented or replaced in this role by the
topic-marker ba, which has different distributional restrictions. In other words, low
frequency of use in early definite determiners can have several independent
explanations: nouns have not yet lost their capacity to be interpreted definitely
(bareness is not interpreted as indefiniteness) and the nascent definite determiner
might be competing with other discourse-salience markers.

Finally, the discourse-management basis of the "definiteness" underlying the 
use of ni explains why a language that does not need definite markers (since its 
bare nouns are generally self-sufficient in this respect) would still have them. 

The logical orthogonality of two different notions, Hearer-status
(identifiability) and Discourse-status (discourse-salience), and the possibility for the speakers
of a language to articulate the use of a determiner around one rather than the
other notion show that having different theories of definiteness is more
interesting empirically than reducing definiteness to a single notion that attempts to
cover all the instances by generalization.

The higher frequency of ni in written texts and in some idiolects is undeniably
related to contact with Spanish, and at this point it is relevant to note that
contact-induced change has also been held responsible for the generalized spread of article
systems in European languages (Schroeder 2006), which makes Mesoamerican
languages a good opportunity for the study of a similar but ongoing
contact-induced change.
Sources


Unless stated otherwise (e.g. with a label like “my elicitation” or with a
reference to relevant literature), all the materials used in this study are from the
Yokot'an Space Grammar/The Oral Literature of the Endangered Cultural Practices
of Yokot'an Pilgrimages Project, led by Amanda Alejandra Delgado Galván, to
which I contributed as data collection assistant. I hereby acknowledge her
kindness in granting me permission to use them. These materials are archived in
the donated archives section of the Language Archive of the MPI. The “Yokot’an /
Chontal de Tabasco” section of The Language Archive can be found at the
following URL: https://hdl.handle.net/1839/00-0000-0000-001E-8B97-0. As I
consider it important to endorse The Austin Principles of Data Citation in
Linguistics, I have also directly referred to this archive and to its main collector/curator
in the examples and in the References section. Mainly eight texts from the area
of Nacajuca municipality were consulted, with varying depths of (re-)analysis,
for the present study (Table 7).


Table 7: Texts consulted 


File name               Settlement   App. recording length   Type of speech event
chf HT ALB              Tucta        29 min                  Interview/conversation
chf CONV FEL            Tucta        13 min                  Interview/conversation
chf TwoFishingMen LUC   Mazateupa    21 min                  Story narration
chf MG CAR              Tapotzingo   15 min                  Match-path task
chf HP ALE              San Isidro   44 min                  Interview/conversation
chf CONV MAT            San Isidro   23 min                  Interview/conversation
chf HS BLA              San Isidro   5 min                   Hunting Story task for co-motion events
chf FS ESM              San Isidro   6 min                   Frog Story task
chf PS ESM              San Isidro   4 min                   Pear Story task
Abbreviations


In every example from the archive in Delgado-Galván (2018), a code for the
speaker's identity and gender is indicated in the translation line. For example, (Fel-m)
refers to a man (m: masculine gender) and (Arg-f) to a woman (f: feminine
gender). The first set of numbers after the filename of the recording refers to the
line numbers in the ELAN-Flex file; the second set of numbers refers to the time
interval. The list of abbreviations used in the examples is the following:


EXIST  existential
FIL    filler in conversation
NUM    numeral
PREP   preposition
PRO    pronoun
SD     source deictic
Chapter 6


Definiteness across languages and in L2 acquisition


Bert Le Bruyn 
Utrecht Institute of Linguistics OTS 


This paper presents evidence suggesting that article-less languages are not created
equal and that this influences how native speakers of these languages acquire
article languages like English. The evidence suggests that Mandarin learners of English
do not unequivocally bear out the predictions of the Fluctuation Hypothesis,
unlike learners of English with e.g. Korean, Russian and Japanese as an L1. I propose
a research program that approaches articles as a syntax/semantics interface
phenomenon. The program considers the syntax/semantics interface of definiteness in
its entirety and makes no a priori assumptions about how it is best analysed. Rather,
it adopts a data-driven comparative approach with multiple L1s that allows us to give
a fine-grained answer to the question of how L1 influence plays out for definiteness.


1 Introduction 


The L2 acquisition of the definite article has already played an important role in 
the debate on L1 influence. It is one of the morphemes that, according to the 
original morpheme studies (e.g. Dulay & Burt 1974), is acquired by all L2 learners 
at the same time. The work of Ionin and colleagues (e.g. Ionin et al. 2004; Ionin 
& Montrul 2010) has however shown that L1 influence distinguishes between 
learners with an L1 that has articles and those with an article-less L1. I argue that 
the time has come to probe further and look into whether L1 influence is identical 
for all learners with an article-less L1. 

I briefly sketch the SLA literature on L2 article acquisition by learners with an 
article-less L1 (§2), argue that L1 influence from article-less L1s is not uniform (§3, 
§4), and propose a research program that allows us to investigate this in detail 
(§5). 


Bert Le Bruyn. 2019. Definiteness across languages and in L2 acquisition. In 
Ana Aguilar-Guevara, Julia Pozas Loyo & Violeta Vázquez-Rojas Maldonado 
(eds.), Definiteness across languages, 201-219. Berlin: Language Science Press. 
2 From an article-less L1 to an article L2 


Research on the second language acquisition of definite articles by L1 speakers 
of article-less languages dates back at least four decades (see e.g. Hakuta 1976). 
Early studies (Huebner 1983; Tarone & Parrish 1988; Thomas 1989) used the ty- 
pology of definite/indefinite contexts proposed by Bickerton (1981) to analyze 
the production of L2 learners. This typology is based on two binary features, viz. 
"speaker reference" [+/-SR] and “hearer knowledge" [+/-HK]. The outcomes of 
these studies were mixed, e.g. Thomas (1989) argues that L2 learners associate the 
definite with the feature [+SR] whereas Master (1987) argues that they associate 
it with the feature [+HK], thus leading to significantly different predictions. 

In the early years of this century, an experimental paradigm came up that 
singled out one specific subtype of [+SR; -HK] contexts. Ionin (2003) initiated 
this paradigm and hypothesized that the problems that pop up in [+SR; -HK] 
contexts are primarily due to the fact that learners confuse specificity and defi- 
niteness. Specificity in this paradigm is defined as the speaker's intention to refer 
to a unique and noteworthy individual in the set denoted by the NP (Ionin et al. 
2004). (1) presents an item with a specific referent (a very important client from 
Seattle) while (2) presents an item with a non-specific referent (a student): 


(1) Specific referent 
Jennifer: Hello, Helen? This is Jennifer! 
Helen: Hi Jennifer! It’s wonderful to hear from you. I suppose you want to 
talk to my sister? 
Jennifer: Yes, I haven’t spoken to her in years! 
Helen: I’m very sorry, but she doesn’t have time to talk right now. She is 
meeting with a very important client from Seattle. He is quite rich, and 
she really wants to get his business for our company. (Ko et al. 2010: 239) 


(2) Non-specific referent 
Context: At a university. 
Professor Clark: I’m looking for Professor Anne Peterson. 
Secretary: I’m afraid she is busy. She has office hours right now. 
Professor Clark: What is she doing? 
Secretary: She is meeting with a student, but I don't know who it is. (Ionin 
et al. 2004: 68)¹ 


¹I provide examples taken from Ko et al. (2010) and Ionin et al. (2004). These represent the most 
recent instance of Ionin's (2003) paradigm by Ionin and colleagues. 
Specificity in (1) is operationalized by having Helen add insider details about 
the referent in the form of modifiers and a follow-up sentence. These details suggest 
that Helen has a unique client in mind who is furthermore noteworthy. The 
operationalization of non-specificity in (2) lies in the absence of additional information 
about the referent and the explicit statement of the lack of knowledge 
about his/her identity. Ionin et al. (2004) show that Korean and Russian L2 learners 
of English who are asked to choose between a, the or ø as a determiner for 
very important client and student are more likely to choose the for the former 
than for the latter.² 

Ionin's paradigm has generated consistent results in a number of replication 
studies involving L2 learners of English with an article-less L1 (e.g. Ko et al. 2010 
for Russian and Korean; Hawkins et al. 2006 for Japanese). On the most recent 
interpretation of the data the paradigm has generated (Ionin et al. 2009), the 
problem L2 learners face is that article systems cross-linguistically come in two 
varieties, one organized around definiteness, the other around specificity and 
definiteness. English represents the former (Table 1), Samoan the latter (Table 2). 


Table 1: The English article system 


            +definite   -definite 
+specific   the         a 
-specific   the         a 


Table 2: The Samoan article system (Ionin et al. 2009) 


            +definite   -definite 
+specific   le          le 
-specific   le          se 


²In this paper, I do not separately report on native speaker controls but refer to Ionin et al. 
(2004) and Le Bruyn & Dong (2017a,b) for the relevant data. Native speakers in these studies 
performed at ceiling on providing indefinite articles both in the indefinite specific and 
indefinite non-specific condition. 
The difference between the two systems lies in the fact that the Samoan “definite” 
article is also used for specific indefinites. Ionin and colleagues hypothesize 
that L2 learners need to determine which of the two article systems applies in the 
languages they are learning, leading them to fluctuate between the two systems 
and sometimes overproduce definite articles in specific indefinite contexts. This 
hypothesis is known as the Fluctuation Hypothesis and is the most influential 
theory-driven explanation about L2 definite article acquisition to date. 
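
As a minimal sketch of the contrast the Fluctuation Hypothesis builds on, the two article systems can be written as lookup tables over the features [±definite] and [±specific]. The English entries follow the standard description (the for definites, a for indefinites) and the Samoan entries follow Table 2; the encoding as Boolean pairs and the function name are illustrative assumptions, not part of Ionin et al.'s proposal.

```python
# Keys are (definite, specific) pairs; values are the article used.
ENGLISH = {
    (True, True): "the", (True, False): "the",
    (False, True): "a", (False, False): "a",
}
SAMOAN = {
    (True, True): "le", (True, False): "le",
    (False, True): "le", (False, False): "se",
}


def contexts_sharing_definite_form(system):
    """Return all contexts marked with the form used for specific definites."""
    definite_form = system[(True, True)]
    return {context for context, form in system.items() if form == definite_form}


# Contexts where the Samoan-type system uses its "definite" form but the
# English-type system does not: the predicted locus of fluctuation.
fluctuation_locus = (
    contexts_sharing_definite_form(SAMOAN)
    - contexts_sharing_definite_form(ENGLISH)
)
print(fluctuation_locus)
```

Under this encoding, the only context on which the two systems disagree is the specific indefinite one, which is where the hypothesis expects learners to overproduce the.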


3 Evidence against the Fluctuation Hypothesis? 


This section is named after Snape et al. (2006), a paper that brings together three 
independently carried out replication studies of Ionin et al. (2004) and finds that 
Japanese learners of English nicely follow the predictions of the Fluctuation Hy- 
pothesis but that Mandarin learners of English do not. 


3.1 Snape et al. (2006) 


The data from the Japanese learners which Snape et al. (2006) report on comes 
from Hawkins et al. (2006) and Reid et al. (2006). Tables 3 and 4 provide a sum- 
mary of the data, focusing on the two contexts that allow us to check the predic- 
tions of the Fluctuation Hypothesis: specific indefinite contexts and non-specific 
indefinite contexts.³ 


Table 3: Percentage of the responses by 12 Japanese respondents in the 
specific and non-specific conditions in Hawkins et al. (2006) 


the 
Non-specific 8% (4/48) 
Specific 50% (24/48) 


Table 4: Percentage of a and the responses by 14 Japanese respondents 
in the specific and non-specific conditions in Reid et al. (2006) 


               a      the 
Non-specific   94%    4% 
Specific       70%    29% 


³To present the cleanest possible picture, I restrict myself to data from experimental items without 
scopal interactions and data that focus, as in Ionin's original experiment, on the singular. 
Even though some details about the studies are not available and I cannot 
report the data fully in parallel, the general picture is clear: Japanese learners 
appear to be sensitive to specificity and their production of English articles bears 
out the predictions of the Fluctuation Hypothesis. Both in Hawkins et al. (2006) 
and Reid et al. (2006), Japanese learners overproduce definites in the specific 
indefinite condition but not in the non-specific indefinite condition. 

The data from Mandarin learners that Snape et al. (2006) report on are taken 
from Ting (2005). They are summarized in Table 5. 


Table 5: Percentage of a and the responses by 8 Mandarin respondents 
in the specific and non-specific conditions in Ting (2005) 


               a      the 
Non-specific   94%    4% 
Specific       92%    3% 


The contrast between the Mandarin and the Japanese learners is striking: while 
Japanese learners overproduce definites in 29 to 50% of specific indefinite con- 
texts, Mandarin learners seem to behave like native speakers in only overproduc- 
ing definites in 3% of the same contexts. 

Snape et al. (2006) conjecture that the contrast between Mandarin and Japa- 
nese learners might be explained by the fact that Mandarin is in a more advanced 
stage of developing an article system parallel to that of English. It would be gram- 
maticalizing the numeral yi (‘one’) as the indefinite and the demonstrative nei 
(‘that’) as the definite. L1 transfer could then explain why Mandarin learners per- 
form more native-like. 


3.2 Assessing data and analyses 


Let us - for the moment - take the data of Ting's study at face value. What 
they indicate then is that there is L1 influence. Whether Snape et al.'s conjecture 
is on the right track or even plausible is however impossible to tell. The study 
falls short of providing sufficient motivation at two levels: (i) it does not provide 
any comparative data that would support the difference in grammaticalization 
between Mandarin and Japanese, (ii) it provides no systematic way of linking the 
alleged difference to the performance of L2 learners. 

In the remainder of this paper, I will do two things. The first is to provide data 
from two further small-scale studies that lend support to the idea that Mandarin 
learners of English do not unequivocally bear out the predictions of the Fluctuation 
Hypothesis (§4). The second is to present a new methodology that allows 
us to systematically study L1 influence in acquisition (§5). 


4 Mandarin learners and the Fluctuation Hypothesis 


Snape et al.'s study is not the only one that has looked into the predictions of the 
Fluctuation Hypothesis for Mandarin learners. Trenkic (2008) did the same and 
- unlike Snape et al. - found that Mandarin learners overproduce definites in 
Ionin et al.'s (2004) specific contexts.⁴ In this section, I present two small follow-up 
studies that seem to pattern more with the data from Snape et al. (2006). The 
conclusion I draw is that Mandarin learners of English do not unequivocally bear 
out the predictions of the Fluctuation Hypothesis. 


4.1 Replicating Ting's null result 


The replication of a null result might seem like an irrelevant exercise, but given 
the small sample size of Ting's original study, I think it is a worthwhile enterprise 
to convince us that Mandarin learners are likely to be different from learners with 
other article-less L1s. 

I report here on an experiment I conducted with 35 second-year students of 
the Zhejiang Ruian High School. I selected this population rather than university 
students in or outside of China to make sure that their general proficiency was 
unlikely to be higher than that of the Japanese learners Hawkins et al. (2006) and 
Reid et al. (2006) report on. Their ages matched the year they were in (16 and 17) 
and none of them had spent time abroad or was proficient in an article language 
other than English. 

I recycled 4 specific indefinite and 4 non-specific indefinite items from Ionin 
et al. (2004).⁵ I furthermore added 36 fillers (partly recycled, partly invented), 
balancing the anticipated a and the responses. 


⁴Trenkic (2008) however does not agree with the interpretation of the data. See Trenkic (2008) 
and Ionin et al. (2009) for discussion. 

⁵The specific indefinite items I used were items 25, 26, 27 and 28 from Ionin et al. (2004). For 
the non-specific indefinite items, I used items 37, 38, 39 and 40. These non-specific items were 
control items in the original study but do not contain the explicit statement of lack of speaker 
knowledge criticized in Trenkic (2008). Ionin et al. (2009) indicate that this explicit statement of 
lack of speaker knowledge is not a crucial part of the operationalization of non-specificity and 
Ionin et al. (2004) found that their indefinite control items pattern with non-specific indefinite 
items: there is a significant difference in the responses with the specific indefinite test items 
(p < 0.001) but not with the non-specific indefinite test items. 
The items were semi-randomized and presented as a paper-and-pencil forced-choice 
elicitation task that was followed by a language proficiency test with the 
same format. Participation was framed in a classroom setting. As in Ionin et al.'s 
original study, each item of the experiment came with a blank and three options 
to choose from: a, the or ø. There was no time limit but all students finished the 
experiment and proficiency test within 45 minutes. 

The proficiency test was not designed to classify the level of the learners based 
on standardized levels like those of the CEFR but to allow for a relative compar- 
ison between the subjects of the current experiment and those of three parallel 
experiments probing the role of modification. As such, the results are less rele- 
vant to the current study and I will consequently restrict myself to reporting the 
results of the experiment itself. 

Table 6 presents the descriptive results of the study in parallel with the data 
in Tables 3-5. 


Table 6: Percentages and absolute frequencies of a, the and ø responses 
by 35 Mandarin respondents 


               a               the            ø 
Non-specific   84% (118/140)   9% (13/140)    6% (9/140) 
Specific       88% (123/140)   9% (13/140)    3% (4/140) 


The data of the non-specific and the specific condition are almost fully parallel. 
I ran a mixed effects model with item and participant as random factors. Given 
that the selection of ø gives no insight into whether subjects consider the item 
indefinite or definite, I modeled these responses as missing data. As expected, 
there was no overall effect of condition and pairwise comparisons showed no 
difference between the two conditions (F(1, 165) = 0.002, p = 0.963). 
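
The descriptive comparison behind this analysis can be illustrated with the counts in Table 6. The sketch below only computes the share of the among the a and the responses, with ø treated as missing; it does not reproduce the mixed effects model, and the variable and function names are illustrative.

```python
# Response counts per condition, copied from Table 6.
TABLE_6 = {
    "non-specific": {"a": 118, "the": 13, "ø": 9},
    "specific": {"a": 123, "the": 13, "ø": 4},
}


def the_rate(counts):
    """Share of 'the' among informative responses (ø treated as missing)."""
    informative = counts["a"] + counts["the"]
    return counts["the"] / informative


rates = {condition: the_rate(counts) for condition, counts in TABLE_6.items()}
for condition, rate in rates.items():
    print(f"{condition}: {rate:.3f}")
```

Both rates fall just under 10%, which is the near-identity of the two conditions that the model confirms.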

I interpret the data in Table 6 as indicating that Mandarin L2 learners are un- 
likely to be sensitive to specificity in the way it is operationalized by Ionin et al. 
(2004). As I indicated before, I am aware of the fact that few to no conclusions 
can be drawn on the basis of a null result, but I did consider it relevant to at least 
check whether the null result found in Ting (2005) is not merely due to its small 
sample size. 
4.2 Changing paradigms 


Le Bruyn & Dong (2017a) designed an alternative paradigm to check the predictions 
of the Fluctuation Hypothesis. The results reported in Le Bruyn & Dong 
(2017b) indicate that Mandarin learners behave exactly opposite to the predictions 
of the Fluctuation Hypothesis. 

The paradigm of Le Bruyn & Dong has two experimental conditions: a specific 
indefinite condition and a non-specific indefinite condition. To operationalize in- 
definiteness, we used DPs whose semantic content does not guarantee unique- 
ness and whose referents are non-familiar. This choice was inspired by the fact 
that DPs whose semantic content guarantees uniqueness involve nouns and ad- 
jectives (like superlatives) that typically occur with a definite. Using these nouns 
would make it hard to distinguish grammatical from collocational knowledge. 

To operationalize specificity and non-specificity, we did not resort to adding 
or leaving out insider details. Rather, we presented specific referents as notewor- 
thy by turning them into the protagonists of a story and presented non-specific 
referents as non-noteworthy by turning them into secondary characters: 


(3) Have I already told you about the scariest moment of my life? Well, one day 
I saw a girl on top of a building... All of a sudden, she starts to dance, slips 
on a brick and falls off the building! Fortunately she landed on some 
cardboard boxes and didn't get hurt... 


The girl is the protagonist in (3): after her introduction, she is immediately 
picked up as the subject of the next sentence and she remains the main character 
of the story throughout. The brick is a secondary character: it is introduced but 
never referred back to. We made 8 stories following the setup of the one in (3): 
(i) introduction of the protagonist, (ii) story about actions of the protagonist, (iii) 
optional introduction of a secondary inanimate character, (iv) continuation of 
the story about the protagonist. A further 8 stories were created as fillers and 
had a freer structure.⁶ 

We adopted the forced choice setup used in Ionin et al.'s specificity paradigm 
but limited the answer possibilities to the definite and the indefinite article. For 
the experimental items, an article had to be selected for the DP introducing the 
protagonist (four items) or the secondary character (four items), thus leading to 
our two experimental conditions. For the fillers, the relevant DPs concerned col- 
locationally and/or grammatically enforced definites (four items) and indefinites 
(four items): 


⁶To keep the processing cost of the task as low as possible we decided not to increase the number 
of fillers beyond 8. 
(4) You'll never guess what happened today! I've seen a woodpecker for the 
first time in my life. 


(5) You'll never guess what happened today! You know I have a sister? Well, 
she came to visit me for the first time in 10 years! 


Wherever possible, the similarity between the stories across the two condi- 
tions was maximized. We took care, however, to create sufficient variation to 
prevent subjects from inferring answers. One way of doing so was to use the 
possessive my daughter in the experimental item based on (3) when asking par- 
ticipants to fill in the blank for the backgrounded character. 

To create a communicative context for the stories, we inserted them in a pub 
context in which one character tells them to another. This was done pictorially 
as in Figure 1 and Figure 2. 


Have I already told you about the scariest moment of my life? 
Well, one day my daughter was standing on top of a 
building... All of a sudden, she starts to dance, slips on a|the 
brick and falls off the building! Fortunately she landed on 
some cardboard boxes and didn't get hurt... 


Option 1: a brick 


Option 2: the brick 


Figure 1: Example of a non-specific/backgrounded item 


The participants were 22 L1 Mandarin/L2 English speakers. All were under- 
graduate students of English at the Beijing International Studies University. The 
test was administered by a student assistant in a quiet environment at the uni- 
versity. Participants were tested individually. The instructions as well as the 16 
semi-randomized test stories (8 experimental items and 8 fillers) were presented 
in a PowerPoint presentation with one slide for the instructions and one slide for 
each test story. Participants were asked to indicate for each story whether they 
preferred the version with the indefinite (Option 1) or the definite article (Option 
2). A small language biography survey was orally carried out by the student as- 
sistant to check for the potential influence of stays abroad or of other languages. 
No student had spent time in an English-speaking country or mastered an article 
language other than English. The participants were given no time limit but all of 
them completed the experiment in under five minutes. 


Have I already told you my favorite burglar story? Well, one 
day I saw this|a guy around our neighbor's house... He looked 
very suspicious, checking the windows, the doors... 
Fortunately it turned out it was our neighbor who had 
forgotten his key! 


Option 1: this guy 


Option 2: a guy 


Figure 2: Example of a specific/foregrounded item 

Table 7 summarizes the results of the 22 participants on the test items. L2 
learners are at ceiling in the specific/foregrounded condition but produce 31% of 
definites in the non-specific/backgrounded condition. 


Table 7: Percentage of a and the responses by 22 Mandarin respondents 
in the foregrounded and backgrounded conditions in Le Bruyn & Dong 
(2017b) 


               a             the 
Non-specific   69% (53/88)   31% (27/88) 
Specific       95% (75/88)   5% (5/88) 


To determine the significance of these results, we ran a mixed effects model 
with item and participant as random factors. There was a significant effect of 
condition. Pairwise comparisons of the model showed that the foregrounded 
and the backgrounded conditions were significantly different from each other 
(t(174) = 4.576, p < 0.001). 

The results indicate that our participants were likelier to produce a definite for 
non-specific referents than for specific referents. This is exactly the opposite of 
what we would expect based on the Fluctuation Hypothesis. In combination with 
the data from Ting (2005) and the data I presented in §4.1, we conclude that evidence 
is accumulating that suggests Mandarin learners of English are different 
from learners with other article-less L1s in that they do not unequivocally bear 
out the predictions of the Fluctuation Hypothesis. In §5, I propose a research 
program that aims at establishing L1 influence in article acquisition for learners 
with an article-less L1. I approach articles as a syntax/semantics interface phe- 
nomenon. The setup of the program allows it to be adapted to study L1 influence 
for other phenomena at the syntax/semantics interface. 


5 Establishing L1 influence: A research program 


Jarvis (2000) set the current standard in transfer research. In order to argue for 
transfer from L1 to L2, he requires a research design with learners from multi- 
ple L1 backgrounds that convincingly shows that: (i) learners with the same L1 
background pattern together (intragroup homogeneity), (ii) learners from differ- 
ent L1 backgrounds behave differently (intergroup heterogeneity), and (iii) differences 
between the groups are linked to differences in their L1s (cross-linguistic 
congruity). 

Demonstrating cross-linguistic congruity presupposes cross-linguistic comparison, 
the study of the many-to-many mapping patterns between the syntax/semantics 
interfaces (SSIs) of L1s and Target Languages (TLs). For this comparative 
groundwork, SLA researchers should be able to rely on syntacticians/semanticists. 
Current work on transfer for articles however shows that the available 
groundwork will not do. In §3.1, it was shown that Snape et al. (2006) found 
that Mandarin learners of English outperform Japanese learners on their acquisition 
of the English article system. They conjecture that this is due to the fact 
that the Mandarin demonstrative nei and numeral yi (‘one’) are close to English 
the and a. If they are right, this entails that the meanings of demonstratives and 
numerals in Mandarin and Japanese partly overlap and partly do not and that 
their relations to demonstratives, numerals and articles in English are different. 
A full argumentation for transfer would then need to focus on those contexts for 
which Mandarin and Japanese differ in their use of demonstratives or numerals. 
There is however no work in cross-linguistic syntax/semantics with this level of 
granularity that transfer research can build on. 

The example from Snape et al. (2006) shows a realistic picture of cross-linguis- 
tic syntax/semantics. Too often, two simplifying assumptions are made: (i) things 
that superficially look the same are the same (e.g. numerals, demonstratives), (ii) 
languages either make the same distinctions or are underspecified (definiteness) 
without there being a (combined) role for other expressions. These simplifications 
are a limitation in cross-linguistic syntax/semantics. The first challenge which a 
systematic study of L1 influence in article acquisition faces is thus to force a 
paradigm shift in cross-linguistic syntax/semantics that gives transfer research 
the groundwork it needs (the comparative challenge). 
The example from Snape et al. (2006) is also indicative of another challenge 
the field faces. Transfer research at the SSI is too often synonymous with L2 
morpheme studies. This is a reductionist view in two respects. The first is that 
the SSI is not a mere sum of morphemes but a system in which all morphemes 
interact. The second is that the SSI of L2 learners can only be properly understood 
if we model it as a system in which the SSIs of the learner's L1 and TL come 
together. We need methodology that allows us to do justice to the full complexity 
of the SSI of L2 learners (the L2 interface challenge). Meeting this challenge allows 
us to compare the SSIs of L2 learners from the same L1 background and across 
learner groups (intragroup homogeneity, intergroup heterogeneity) while at the 
same time comparing them to the L1s and TL of the learners (cross-linguistic 
congruity). 


5.1 Iterated Translation Mining 


Iterated Translation Mining (ITM) overcomes the comparative challenge through 
the adoption of a data-driven approach in which translation equivalents are used 
to identify the semantic features that interact with definiteness and study how 
they are realized cross-linguistically. The output is - for each language - an anal- 
ysis of the SSI of definiteness in the nominal domain. The formalization includes 
an overview of lexical items/constructions with their associated features (hence- 
forth feature-based lexicons) and the rules that govern their use in each of the 
languages (henceforth grammars). To be able to guarantee cross-linguistic com- 
parability, I adopt formal semantics to define the semantic features and I set up 
the grammars in (Bi-directional) Optimality Theory (Prince & Smolensky 2004; 
Hendriks et al. 2010). Monolingual reference corpora and native speaker experiments 
allow us to overcome the limitations inherent to a corpus-driven approach. 


5.1.1 Data 


ITM uses translation corpora to generate networks of translation equivalents 
across languages.⁷ For example, one takes a and the as seed words, looks up 
their uses in the English source texts and matches their translations. These can 
be demonstratives, specific word orders, case configurations, etc. As a second 
step, one looks up all uses of the translations of a and the in the source and 
target texts and matches the translations of these in all the languages of the corpus. 
The first step creates one-way contrastive analyses focusing on how English 
nominal definiteness is rendered in the other languages. The second step creates a 
many-to-many contrastive analysis that gives access to the paradigms of nominal 
definiteness cross-linguistically with an equal weight for the different languages. 


⁷A reviewer correctly points out that the parallel methodology severely restricts the number of 
languages that can be investigated. I hope that this is only a matter of time and 
that parallel corpora will become available for many more languages. 


[Figure 3 placeholder: English forms (upper): the, a; Mandarin forms (lower): nei, bare noun, yi] 


Figure 3: TM 
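
The two mining steps described above can be sketched on a toy aligned corpus. Everything in the block is a hypothetical illustration: the corpus entries, the language codes and the function name are assumptions, and a real implementation would work on aligned text rather than on pre-annotated markers.

```python
# Hypothetical toy corpus: each context lists the marker used per language.
CORPUS = [
    {"en": "the", "zh": "bare noun"},
    {"en": "the", "zh": "nei"},
    {"en": "a", "zh": "yi"},
    {"en": "a", "zh": "bare noun"},
    {"en": "that", "zh": "nei"},
    {"en": "one", "zh": "yi"},
    {"en": "every", "zh": "mei"},
]


def mine(corpus, seeds, iterations):
    """Collect contexts reachable from the seed forms.

    seeds is a set of (language, form) pairs. After each pass, every form
    found in a selected context becomes a seed for the next pass.
    """
    forms = set(seeds)
    selected = set()
    for _ in range(iterations):
        for index, context in enumerate(corpus):
            if any(pair in forms for pair in context.items()):
                selected.add(index)
        for index in selected:
            forms.update(corpus[index].items())
    return selected, forms


seeds = {("en", "the"), ("en", "a")}
first_step, _ = mine(CORPUS, seeds, iterations=1)
second_step, _ = mine(CORPUS, seeds, iterations=2)
print(sorted(first_step))
print(sorted(second_step))
```

With one pass, only the contexts containing the or a are selected; the second pass adds the contexts in which nei and yi correspond to that and one, while the unrelated context is never reached.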

The output of the data collection is a set of contexts with - for every lan- 
guage - an indication of the markers of definiteness. Multi-Dimensional Scal- 
ing (MDS) automatically generates clusters of contexts by maximizing the dis- 
tances between contexts in which (individual) languages use different markers 
and minimizing the distances between contexts in which the same markers are 
used (Hamming distance). Based on Analyses of Similarities (Clarke 1993; Oksa- 
nen et al. 2017), I determine the significance of these clusters. The combination 
of the clusters and the contexts that appear in them is an inductively construed 
semantic map (Haspelmath 1997), the basis for our cross-linguistic analyses. It 
furthermore allows us to shift the focus of transfer research from morphemes to 
the full SSI. 
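
The dissimilarity measure that feeds the scaling step can be sketched as follows. The block only builds the Hamming distance matrix between contexts (the number of languages in which two contexts are marked differently); the MDS and Analysis of Similarities steps are not shown, and the contexts and names used are hypothetical.

```python
# Hypothetical contexts annotated with the marker each language uses.
CONTEXTS = {
    "c1": {"en": "the", "zh": "bare noun"},
    "c2": {"en": "the", "zh": "nei"},
    "c3": {"en": "a", "zh": "yi"},
    "c4": {"en": "a", "zh": "bare noun"},
}
LANGUAGES = ("en", "zh")


def hamming(context_a, context_b, languages):
    """Count the languages in which two contexts use different markers."""
    return sum(context_a[lang] != context_b[lang] for lang in languages)


def distance_matrix(contexts, languages):
    """Return context names and the pairwise Hamming distance matrix."""
    names = sorted(contexts)
    matrix = [
        [hamming(contexts[a], contexts[b], languages) for b in names]
        for a in names
    ]
    return names, matrix


names, matrix = distance_matrix(CONTEXTS, LANGUAGES)
for name, row in zip(names, matrix):
    print(name, row)
```

A matrix of this kind is symmetric with a zero diagonal, so it can serve directly as the dissimilarity input to a scaling procedure.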

ITM introduces iterations in the Translation Mining technique (TM) I designed 
with Henriëtte de Swart and Martijn van der Klis (van der Klis et al. 2017). 


5.1.2 Analysis 


The way the analysis proceeds is close to the one in TM (e.g. de Swart et al. 2017). 
I illustrate with an example in which I apply TM and ITM to the same (hypothetical) 
dataset. I restrict my attention to two languages (English and Mandarin) and 
to a subset of the variation I expect to find. 

The points in Figures 3 and 4 represent contexts from a translation corpus. 
Their colours refer to the forms in English (upper), the coloured groupings to 
the forms in Mandarin (lower). The clusters that emerge by crossing the form 
variation in the two languages are numbered. 
[Figure 4 placeholder: English forms (upper): that, the, a, this, one; Mandarin forms (lower): nei, bare noun, yi] 


Figure 4: ITM 


By inspecting commonalities and differences between clusters, I identify the 
semantic features at play and the constraints that govern their use. The fea- 
tures are formalized in feature-based lexicons, the constraints in bi-directional 
OT grammars. TM presents the picture we know from the literature: Mandarin 
doesn't have articles and uses bare nouns instead with an occasional use of 
demonstratives like nei (‘that’) for definites and the numeral yi (‘one’) for indefi- 
nites. ITM provides the fuller picture we need by translating back the translations 
of the and a and providing the relevant oppositions to study the contribution of 
the and a when they are not translated by a bare noun and the contribution of nei 
and yi when they do function as translations of the and a. Adding more article-less 
languages (like Japanese and Russian), as well as all the iterations, allows us to 
complete this picture (different distribution of bare nouns, demonstratives, numerals, 
case, word order, etc.). The increased complexity is managed through 
so-called scenarios that plot subparts of the data and allow a stepwise analysis 
of the full picture. The renewed interest in variations of definiteness across languages, 
not least due to Florian Schwarz's work (2009; 2019 [this volume]), 
will undoubtedly contribute to the analysis. 


5.2 LOG-IT 


LOG-IT (Logging Lexicons and OT Grammars in Translation) is a data mining 
technique that uses a custom-made high quality L1 to L2 translation corpus to 
inductively study the SSI of individual learners at the same level of detail as the 
output of ITM. It thus overcomes the L2 interface challenge. I use the output of 
ITM in two ways. The clusters identified through ITM guide the selection of con- 
texts for the translation corpus. For the analysis, I use the ITM feature-based lexi- 
cons and OT rules to generate all possible variations on the languages involved. I 
compare these to the production of the learners and establish individual rankings 
of these variations. I establish similar rankings for the languages of the project 
based on our corpus and experimental data. The rankings allow for the mapping 
and comparison of the SSI of individual learners, L1 groups and L1/L2s. 


5.2.1 Data 


I chose written L1 to L2 text translation as a data collection protocol for two 
reasons: 


(a) Data that discriminate between possible rankings are needed. (Semi)-free 
production tasks cannot target all relevant data per learner. 


(b) Unlike other high control tasks like Forced Choice Elicitation, translation 
can focus on any level of production (DP/VP, sentence, discourse). 


Relying on translation data comes with two risks. The first is a translation bias:
learners might be influenced by specific wordings in the source text or resort to
general translation processes like simplification. To address this bias, I include
two control tasks: a story rewrite task to control for influence from the source
text and L2-to-L1 translation to control for translation styles. The second risk is
overinterpretation of the data: learners' doubts are not visible in a finished
translation, and learners might resort to a word-by-word or sentence-by-sentence
strategy, whereas I aim to analyze all levels of production. To address this risk,
I exploit the potential of simultaneous keystroke logging and eye-tracking during
translation. I use a combination of measures related to corrections, eye-key spans
(Timarová et al. 2011 and references therein), attention units (e.g. Hvelplund
2016), etc., to establish a measure of reliability per data point. The relevant
experimental software is called TRANSLOG II and was developed
in the field of Translation Studies (Schwieter & Ferreira 2017).


5.2.2 Analysis 


I use the semantic features and OT grammar constraints identified through ITM 
to generate all possible lexical entries for the forms used by the learners and all 
possible OT grammars. By crossing lexicons and grammars, I generate all possi- 
ble variations on the languages involved and rank these per learner. Rankings 
are based on how accurately the variations predict learner production and cor- 
pus/experimental data. Accuracy is established as a measure of (weighted) inter- 
rater reliability where the output of the learner and the variation are modeled as 
raters. 
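
The crossing and ranking steps described above can be sketched as follows. This is only an illustration in Python, not the LOG-IT implementation: the two toy lexicons, the two toy grammars, the context features and the learner data are all hypothetical, and unweighted Cohen's kappa is used as a stand-in for the (weighted) inter-rater reliability measure, which the text does not specify further.

```python
from collections import Counter
from itertools import product


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long label sequences."""
    n = len(rater_a)
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:  # both raters always use the same single label
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical lexicons: what feature does "the" encode?
LEXICONS = {
    "the=definite": lambda ctx: "the" if ctx["definite"] else "a",
    "the=specific": lambda ctx: "the" if ctx["specific"] else "a",
}

# Hypothetical grammars: is the article always realized?
GRAMMARS = {
    "always overt": lambda form, ctx: form,
    "drop if non-specific": lambda form, ctx: form if ctx["specific"] else "Ø",
}


def rank_variations(contexts, learner_forms):
    """Cross every lexicon with every grammar and rank the resulting
    variations by how well they agree with the learner's production."""
    scores = []
    for (lex_name, lexicon), (gram_name, grammar) in product(
        LEXICONS.items(), GRAMMARS.items()
    ):
        predicted = [grammar(lexicon(ctx), ctx) for ctx in contexts]
        scores.append(((lex_name, gram_name), cohen_kappa(predicted, learner_forms)))
    return sorted(scores, key=lambda item: item[1], reverse=True)


contexts = [
    {"definite": True, "specific": True},
    {"definite": True, "specific": False},
    {"definite": False, "specific": True},
    {"definite": False, "specific": False},
]
learner_forms = ["the", "the", "a", "a"]  # hypothetical learner production
ranking = rank_variations(contexts, learner_forms)
```

Here each variation and the learner are treated as two raters labelling the same contexts, so the top of `ranking` is the lexicon/grammar pair that best predicts this (invented) learner.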




Bert Le Bruyn 


The distances between learner/language rankings are calculated based on the
Damerau-Levenshtein distance, and a dissimilarity matrix is established. This is
the input for Analyses of Similarities that statistically assess intra-group
homogeneity and inter-group heterogeneity for the L1 groups. I use MDS to graphically
represent similarities and differences between individual learners, learner groups
and languages (Figure 5). In combination with the underlying rankings, the
corresponding graph is an inductively constructed map of L1 influence. The underlying
data make it possible to establish cross-linguistic congruity.
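
The distance step can be illustrated with a short Python sketch. It is not the project's code: the variation labels (`G1`, `G2`, `G3`) and the three rankings are hypothetical, and the function implements the restricted (optimal string alignment) variant of the Damerau-Levenshtein distance, which is an assumption about the variant intended.

```python
from itertools import combinations


def osa_distance(a, b):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Insertions, deletions, substitutions and transpositions of two
    adjacent items each cost 1.
    """
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[n][m]


def dissimilarity_matrix(rankings):
    """Pairwise distances between all rankings, as a nested dict."""
    names = sorted(rankings)
    matrix = {x: {y: 0 for y in names} for x in names}
    for x, y in combinations(names, 2):
        matrix[x][y] = matrix[y][x] = osa_distance(rankings[x], rankings[y])
    return matrix


# Hypothetical rankings of three variations for two learners and one language.
rankings = {
    "learner_1": ["G1", "G2", "G3"],
    "learner_2": ["G2", "G1", "G3"],
    "language_X": ["G3", "G2", "G1"],
}
matrix = dissimilarity_matrix(rankings)
```

A matrix of this kind is what an Analysis of Similarities or an MDS routine would take as input; those steps are not sketched here.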



Figure 5: LOG-IT 


Characterizing learners in terms of rankings of interlanguages does justice to 
the variability that characterizes learner languages (Larsen-Freeman 2006; de Bot 
et al. 2007). The logic behind LOG-IT allows it to deal with L2s and L3s provided 
the languages of the learner are included in ITM. 


6 Conclusion 


I have presented evidence suggesting that article-less languages are not created 
equal and that this influences how native speakers of these languages acquire 
article languages like English. The evidence suggests that Mandarin learners of 
English do not unequivocally bear out the predictions of the Fluctuation Hypoth- 
esis, unlike learners of English with e.g. Korean, Russian and Japanese as an L1. 
I have proposed a research program that approaches articles as a syntax/semantics
interface phenomenon. The program considers the syntax/semantics interface
of definiteness in its entirety and makes no a priori assumptions about
how it is best analysed. Rather, it adopts a data-driven comparative approach
with multiple L1s that allows for a fine-grained answer to the question of how L1
influence plays out for definiteness.


Chapter 7
Licensing D in classifier languages and 
"numeral blocking" 


David Hall 
Queen Mary University of London 


Since Cheng & Sybesma (1999), there has been much discussion of how the interaction
of functional heads in the extended nominal projection in numeral classifier
languages gives rise to a definite interpretation. An important observation that
came out of this discussion is that there appears to be some kind of interaction
between a classifier head (call it Cl) and definiteness, where either Cl and D
interact through head movement (Simpson 2005), or the Cl head itself introduces an
ι-operator. Cheng & Sybesma note that in Cantonese, which exhibits bare Cl-N
sequences with a definite interpretation, the addition of a numeral has the effect of
"undoing the definiteness". The standard approach to accounting for this blocking
of definiteness is that of Simpson (2005), where it is suggested that for a definite
interpretation to arise in classifier languages, the Cl head has to move to D (in the
spirit of Longobardi 1994). The blocking of a definite interpretation in Cantonese is
the result of a Head Movement Constraint violation; Cl cannot move to D over the
numeral. I show that this numeral blocking effect extends to other languages too,
and I argue based on data from those languages that a Head Movement Constraint
based account of definiteness in classifier languages cannot capture the facts, and
that we require an alternative. I put forward a proposal which has the consequence
that the classifier and numeral form a constituent to the exclusion of the noun, and
then discuss some suggestive evidence in favour of such a structural configuration.


1 Introduction 


A much discussed question related to numeral classifier languages¹ is how they
encode definiteness, and whether there are differences among classifier languages
with respect to this property. Cheng & Sybesma (1999) was an early attempt
to systematically provide a syntactico-semantic explanation for differences
observed between Mandarin Chinese (henceforth MC) and Cantonese, with
respect to the noun phrase configurations which give rise to a definite
interpretation. Cantonese exhibits noun phrases composed of a bare classifier² followed
by a noun (Cl-N phrases), which can be interpreted as a definite noun phrase,
whereas MC only allows an indefinite interpretation for Cl-N phrases. Furthermore,
in both languages, the presence of a numeral always forces an indefinite
interpretation, regardless of whether Cl-N can be definite in that language.


¹ Throughout I use the term classifier languages to mean numeral classifier languages.

In this paper I discuss the standard explanation for the definite interpretation
associated with bare classifiers in Cantonese, and the related explanation for the
"blocking" effect that the numeral has on definiteness, which has previously been
tied to the Head Movement Constraint (HMC). I show that the numeral blocking
effect extends to other classifier languages, including two languages where there
is an overt morphological instantiation of definiteness on the classifier. I then
argue that the standard HMC explanation of numeral blocking does not work in
light of morphological facts from one of these languages, under a certain set of
well-motivated assumptions about the structure of the DP. I ultimately conclude
that a revised analysis, involving two separate structures for Cl-N phrases and
phrases with a numeral, is required, and that a consequence of this analysis, that
numerals form a constituent with the classifier to the exclusion of the noun, is
supported by typological evidence related to word order in classifier languages.

In the next section I introduce the relevant data from MC and Cantonese, before
introducing the analyses in Cheng & Sybesma (1999) and Simpson (2005).³


2 Definiteness in Mandarin Chinese and Cantonese 


Both Mandarin Chinese (MC) and Cantonese are what I will refer to as classifier
languages, that is, languages which employ a set of morphemes to categorize or
classify the noun that they co-occur with. The classifiers discussed here are
sometimes referred to as Numeral Classifiers (Aikhenvald 2000), particularly given
that they obligatorily appear when a numeral is present. Both languages allow
bare nouns, noun phrases composed of a classifier-noun sequence (Cl-N phrases)
and noun phrases composed of a numeral-classifier-noun sequence (#-Cl-N⁴
phrases) in argument position. However, there are a number of interesting
constraints on where each type of noun phrase can appear. Furthermore, these
constraints differ between the two languages, as discussed in depth in Cheng &
Sybesma (1999).


² Bare here is intended to indicate the absence of a numeral. Many classifier languages, such as
Japanese, disallow classifiers where no numeral is present.

³ Much of the paper is a revised version of parts of §4 and §5 of Hall (2015).

⁴ Throughout, I will use # as an abbreviation for numeral.

Overall, the possible interpretations available to different noun phrases in MC
and Cantonese depend on the shape of the noun phrase: in particular, whether
it is a bare N, a Cl-N, or a #-Cl-N. Jenks (2012) points out that the difference
between MC and Cantonese noun phrase distribution and interpretation can be
subsumed under a larger generalization that appears to hold quite robustly across a
number of Sino-Tibetan and Austroasiatic classifier languages, including Hmong,
Cantonese, MC, Min, and Vietnamese.⁵ The generalization takes the form of two
one-way entailments: if a classifier language has bare nouns which can be
interpreted as definite, then Cl-N phrases will not be interpreted as definite; if a
classifier language has Cl-N phrases which can be interpreted as definite, then
bare nouns will not be interpreted as definite.


(1) Noun phrase interpretation in classifier languages
a. Bare N [+def] → Cl-N [-def]    Type A language
b. Cl-N [+def] → Bare N [-def]    Type B language


MC is a Type A language: it exhibits definite bare nouns and Cl-N phrases
which are obligatorily indefinite. Cantonese is a Type B language: it has definite
Cl-N phrases and obligatorily indefinite bare nouns. Another generalization that
can be added to the above is that, regardless of the availability of a definite
interpretation for a Cl-N phrase, the presence of a numeral always blocks a definite
interpretation.


(2) #-Cl-N [-def]    Type A & B languages


My focus in this paper is on Type B languages; in particular on the definite
interpretation associated with Cl-N phrases, and the reasons why (2) holds in
those languages. In the next subsection I lay out the full set of facts related to
MC and Cantonese, before introducing two previous analyses of the differences
between the two languages.


⁵ Note that Trinh (2011) claims that bare nouns cannot be definite in Vietnamese, but Nguyen
(2004) and Jenks claim otherwise. See also Simpson et al. (2011) for a challenge to the
complementarity of definite bare Ns and definite Cl-N phrases.

⁶ We will see an example of a language in §4.1, Wenzhou Wu, which is a counter-example to
this generalization.


2.1 Mandarin Chinese - a Type A classifier language


MC is a Type A classifier language (following the generalization in 1).⁷ In postverbal
object position, bare nouns can have either a definite or an indefinite interpretation,
whereas in preverbal subject position (or topic position), bare nouns cannot
be interpreted as indefinite (3a), because of a general restriction on the preverbal
subject position which means that indefinite noun phrases cannot appear there
(Huang et al. 2009: 288 and references cited therein). Noun phrases with a
demonstrative are also acceptable in preverbal subject position (3b), and can take on an
anaphoric definite interpretation (in the sense of Schwarz 2009; see Jenks 2015).


(3) a. Gou chi-le dangao.
dog eat-PRF cake


‘The dog ate the cake/a cake.’ NOT ‘a dog...’


b. Nei-zhi gou chi-le dangao.
that-CL dog eat-PRF cake


‘That/the dog ate the cake/a cake.’


Bare count nouns are number neutral, and thus can refer to either singular 
objects or pluralities. Bare nouns can also refer to mass objects (examples taken 
from Cheng & Sybesma 1999, with some modification): 


(4) a. Hufei mai shu qu le.
Hufei buy book go SFP


‘Hufei went to buy a book/books/the book(s).’


b. Hufei he-wan-le tang.
Hufei drink-finish-PRF soup


‘Hufei drank the soup/some soup.’


⁷ Note that throughout I discuss sortal classifiers, and not mensural classifiers, or “massifiers” to
use Cheng & Sybesma’s (1998) term. I believe that massifiers have a different structure, which
is evidenced by their different properties (a modifier can appear between the massifier and the
noun, a modification marker de is optionally present). See Cheng & Sybesma (1998) and Cheng
& Sybesma (1999) for discussion.

⁸ Judgements on example sentences are taken directly from the literature, unless otherwise
stated.

⁹ I focus here on definite and indefinite interpretations, and put aside kind and generic
interpretations, which bare nouns can also take on. For discussion of kind and generic interpretations
in MC, see Krifka (1995).

Where a noun is accompanied by a numeral, a classifier is obligatorily present
(5),¹⁰ and the #-Cl-N phrase is obligatorily indefinite. Cl-N phrases are also
possible without a numeral, and are obligatorily indefinite and singular (6).¹¹ Because
of the “definiteness constraint” on preverbal subject position, Cl-N and #-Cl-N
phrases are degraded in this position (7).


(5) Wo xiang mai liang *(ben) shu.
I want buy two CL book


‘I want to buy two books.’


(6) Wo xiang mai ben shu.
I want buy CL book


‘I want to buy a book.’ NOT ‘I want to buy (some) books.’


(7) a. ?? San-ge xuesheng chi-le dangao.
three-CL student eat-PRF cake


Intended: ‘Three students ate the cake.’


b. * Ge xuesheng chi-le dangao.
CL student eat-PRF cake


Intended: ‘A student ate the cake.’


2.2 Cantonese - a Type B classifier language 


Cantonese is a Type B classifier language (following the generalization in 1). In
postverbal object position, Cl-N phrases can have either a definite or an indefinite
interpretation (8), whereas in preverbal subject position (or topic position), Cl-N
phrases can only be definite (9). As with MC, Cl-N phrases are always singular.¹²
Bare nouns, on the other hand, are obligatorily indefinite (thus being unacceptable
in preverbal subject position, 9a), and are number neutral. Examples here
are again taken from Cheng & Sybesma (1999).

(8) Ngo⁵ soeng² maai⁵ bun² syu¹ (lei⁴ tai²).
I want buy CL book come read


‘I want to buy a book (to read).’


¹⁰ Although see Tao (2006) for a discussion of the phenomenon of classifier reduction (of the
general classifier ge) in spoken Beijing Mandarin Chinese.

¹¹ A possible exception is the classifier-like plural marking element xie, which I put aside here.
See Hall (2015: §4.2.3) for discussion.

¹² Again, this is with the exception of nouns that appear with the “plural classifier” di¹, which I
discuss in Hall (2015: §4.2.3).

¹³ Superscript numbers on Cantonese examples indicate tone.

(9) a. * Gau² soeng² gwo³ maa⁵lou⁶.
dog want cross road
Intended: ‘The dog wants to cross the road.’


b. Zek³ gau² soeng² gwo³ maa⁵lou⁶.
CL dog want cross road


‘The dog wants to cross the road’, NOT ‘a dog...’


(10) Wufei heoi³ maai⁵ syu¹.
Wufei go buy book
‘Wufei went to buy a book/books.’


As with MC, #-Cl-N phrases are always interpreted as indefinite, and thus
are infelicitous in preverbal subject or topic position (examples elicited from a
native Cantonese speaking informant). Here I include a Cl-N phrase (which gets
a definite interpretation) for contrast.


(11) a. Zek³ gau² sik⁶-gan² juk⁶.
CL dog eat-PROG meat
‘The dog is eating meat.’
b. * Loeng⁵-zek³ gau² sik⁶-gan² juk⁶.
two-CL dog eat-PROG meat


Intended: ‘The two dogs are eating meat.’


2.3 Summary 


In summary, we have the set of interpretations in Table 1, associated with partic- 
ular noun phrase configurations, available in the two languages. 

What is important here is that we have a language, i.e. Cantonese, where a 
definite interpretation is possible in a noun phrase composed of a bare classifier 
followed by a noun, but where the introduction of a numeral always blocks a def- 
inite interpretation. An account of the interpretive differences in noun phrases 
between the two languages will focus on two facts: 


1. Cl-N can be definite in Cantonese, but not in MC.


2. #-Cl-N is always indefinite in both languages.


In the next section I introduce two previous accounts of these facts.

Table 1: Summary of §2


Noun phrase config.   Definite   Indefinite   Number

MC
N                     ✓          ✓            Neutral
Cl-N                  ✗          ✓            Sg
#-Cl-N                ✗          ✓            Sg/Pl (# dependent)

Cantonese
N                     ✗          ✓            Neutral
Cl-N                  ✓          ✓            Sg
#-Cl-N                ✗          ✓            Sg/Pl (# dependent)
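
The facts summarized in Table 1 and the generalizations in (1) and (2) can be checked mechanically. The following Python sketch is only an illustration: the dictionary layout and function names are my own, and the truth values encode the descriptions of MC and Cantonese given in §2.

```python
# Availability of definite/indefinite readings per noun phrase configuration,
# following the description of MC and Cantonese in the text.
TABLE_1 = {
    "MC": {
        "N": {"def": True, "indef": True},
        "Cl-N": {"def": False, "indef": True},
        "#-Cl-N": {"def": False, "indef": True},
    },
    "Cantonese": {
        "N": {"def": False, "indef": True},
        "Cl-N": {"def": True, "indef": True},
        "#-Cl-N": {"def": False, "indef": True},
    },
}


def language_type(facts):
    """Classify a language according to generalization (1).

    Returns "A" (definite bare N, no definite Cl-N), "B" (definite Cl-N,
    no definite bare N), or None if the language fits neither pattern.
    """
    bare_def = facts["N"]["def"]
    cl_def = facts["Cl-N"]["def"]
    if bare_def and not cl_def:
        return "A"
    if cl_def and not bare_def:
        return "B"
    return None


def numeral_blocks_definiteness(facts):
    """Generalization (2): #-Cl-N never has a definite reading."""
    return not facts["#-Cl-N"]["def"]


types = {name: language_type(facts) for name, facts in TABLE_1.items()}
blocking = {name: numeral_blocks_definiteness(facts) for name, facts in TABLE_1.items()}
```

A language in which both bare nouns and Cl-N phrases can be definite would come out as `None` here, that is, as a counter-example to (1).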


3 Previous accounts 


3.1 Cheng & Sybesma (1999) 


Cheng & Sybesma (1999) offered the first account of the above distribution of
interpretations across different noun phrase configurations. They argue that the
Cl head in MC and Cantonese plays the (semantic) role that D does in English,
that of introducing a definite interpretation through an iota operator. Following
Chierchia (1998b), this is introduced either directly as a definite classifier, as in
Cantonese, or as a type-shifting last resort operator where no definite lexical item
is available, as in MC. Cheng & Sybesma also propose that a necessary step for
the last resort type-shifting in MC is N-to-Cl movement, which is why bare Ns
can have a definite interpretation in that language. So, in Cantonese, the classifier
is an overt definite article, giving definite Cl-N phrases, and in MC, N moves to
the empty Cl projection, giving definite bare nouns.


¹⁴ Cheng & Sybesma accept that this movement would result in an illicit ordering of the adjective
and noun, if the adjective merges lower than Cl, and the noun moves up to Cl:


(i) Predicted order: N-Adj
[ClP Cl[def] [NP AP N]], with N moving to Cl


They therefore claim that the movement has to be covert.

(12) MC: [ClP Cl[def] N], with N moving to Cl
(13) Cantonese: [ClP Cl[def] N]


Simply put then, the difference between MC and Cantonese lies in how the
definiteness "feature" encoded in the Cl head is licensed. The fact that numerals
block definiteness in both languages is argued to arise from the fact that all
indefinite Cl-N phrases involve the projection of a Numeral head above ClP, as in
(14).


(14) Indefinite Cl-N phrase
[NumeralP Numeral [ClP Cl [NP N]]]


Numerals are claimed to fundamentally involve existential quantification, and
therefore the merger of a Numeral head has the effect of “undoing the
definiteness” (Cheng & Sybesma 1999: 528). From the perspective of compositional
semantics, however, this does not entirely make sense. In the system proposed in
Chierchia (1998b) (based ultimately on Partee's 1986 set of type-shifters), the iota
operator takes a property and returns a unique individual (of type e), whereas
the existential operator takes a property and returns a generalized quantifier (of
type ⟨⟨e,t⟩,t⟩). If we compose the property introduced by N with the iota operator
first at Cl, then an existential quantifier introduced at Numeral would not be
able to compose with the resultant individual (of type e).

(15) [NumeralP Numeral [ClP Cl [NP N]]]
NP: ⟨e,t⟩; Cl (ι): ⟨⟨e,t⟩,e⟩; ClP: e; Numeral (∃): requires an argument of type ⟨e,t⟩; NumeralP: type mismatch
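
The type mismatch can be made concrete with a toy type checker. This is a minimal Python sketch under my own encoding (types as nested pairs, a hypothetical `compose` function for functional application); it only illustrates the argument in the text and is not part of any cited system.

```python
E, T = "e", "t"


def fn(argument, result):
    """A functional type <argument, result>, encoded as a pair."""
    return (argument, result)


PROPERTY = fn(E, T)           # <e,t>
IOTA = fn(PROPERTY, E)        # <<e,t>,e>: property to unique individual
GQ = fn(PROPERTY, T)          # <<e,t>,t>: generalized quantifier
EXISTS = fn(PROPERTY, GQ)     # property to generalized quantifier


def compose(function_type, argument_type):
    """Functional application: return the result type or raise TypeError."""
    if not isinstance(function_type, tuple) or function_type[0] != argument_type:
        raise TypeError(f"cannot apply {function_type} to {argument_type}")
    return function_type[1]


# Cl hosts iota and combines with the property denoted by N.
clp_type = compose(IOTA, PROPERTY)

# Numeral hosts the existential operator; applying it to ClP fails.
try:
    compose(EXISTS, clp_type)
    mismatch = False
except TypeError:
    mismatch = True

# Without iota at Cl, the numeral composes with the property as expected.
numeralp_type = compose(EXISTS, PROPERTY)
```

Once iota has applied at Cl, ClP is of type e, and the existential operator at Numeral has no property left to apply to.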


The individual is bound by the iota operator at the ClP level, meaning that it
can no longer be quantified over in the way suggested by Cheng & Sybesma.¹⁵ If,
on the other hand, the notion of “undoing” of definiteness is intended to mean
that an iota operator is never present in Cl when a numeral is merged, then
this becomes a simple stipulation, and a restatement of the facts. Because of the
inexplicit nature of the explanation, I put aside Cheng & Sybesma's approach to
Numeral Blocking, and instead focus on a related proposal that builds on Cheng
& Sybesma's initial insights. The standard account which avoids the problems
discussed immediately above is developed in Simpson (2005), where the locus
of definiteness is not Cl, but D, assuming that DPs are universal, even where a
language does not exhibit overt articles.


3.2 The DP account 


The DP account of the MC and Cantonese facts is proposed by Simpson (2005)
(and defended by Wu & Bodomo 2009). Simpson builds on the ideas in Cheng &
Sybesma (1999), but crucially the account differs in that it takes D to be the locus
of definiteness, following Longobardi (1994). The central idea is that it is head
movement of Cl to D in Cantonese that gives rise to the definite interpretation
of Cl-N phrases. Definite D must be overtly instantiated by some lexical element
to be licensed, and so a lack of movement of the classifier to the D head results
in an indefinite Cl-N configuration.


¹⁵ It is possible to introduce a covert type-shifter (“IDENT” or “Id” in Partee's terms) to take ClP
from e to ⟨e,t⟩ so that it could combine with the numeral. This would put us in the position
of saying that the iota operator applies only to have the type shifted back by the covert partial
inverse of iota, which is hardly satisfying. It would again in effect be the same as saying that
"numerals undo definiteness", or that the merger of a numeral must be preceded by composition
of ClP with a covert operator that undoes definiteness.

(16) Cantonese Cl-N [+def]: [DP D[+def] [ClP Cl[+def] NP]], with Cl moving to D
(17) Cantonese Cl-N [-def]: [DP D [ClP Cl NP]]


In MC, this movement is not available, presumably because the Cl does not
come with a definiteness feature. This means that a bare Cl-N phrase never
receives a definite interpretation.

An advantage of this head movement approach is that it can straightforwardly
account for the fact that numerals block definiteness in Cantonese, without any
awkward stipulations. Although the exact syntactic position of the numeral is not
explicitly discussed in Simpson (2005), the discussion suggests that the numeral
is introduced as a head above ClP. This means that the Numeral head will act as
an intervenor for Cl-to-D movement, as per the Head Movement Constraint of
Travis (1984), and will therefore block a definite interpretation.


(18) The Head Movement Constraint (HMC)
An X⁰ may only move into the Y⁰ which properly governs it.


(19) [DP D[+def] [NumeralP Numeral (intervenor) ClP]]
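
The locality reasoning behind this account can be sketched in a few lines of Python. The sketch simplifies the HMC to "a head may not skip over an intervening head", and the list-based representation of the nominal spine and the function names are my own assumptions.

```python
def intervening_heads(spine, mover, target):
    """Return the heads that sit between `target` and `mover`.

    `spine` lists the heads of the extended projection from highest to lowest.
    """
    high, low = spine.index(target), spine.index(mover)
    if high >= low:
        raise ValueError("target must be higher in the spine than the moving head")
    return spine[high + 1:low]


def head_movement_allowed(spine, mover, target):
    """Simplified HMC: movement is licit only if no head intervenes."""
    return not intervening_heads(spine, mover, target)


bare_cl_n = ["D", "Cl", "N"]                # Cl-N phrase, as in (16)
with_numeral = ["D", "Numeral", "Cl", "N"]  # #-Cl-N phrase, as in (19)

definite_possible_bare = head_movement_allowed(bare_cl_n, "Cl", "D")
definite_possible_numeral = head_movement_allowed(with_numeral, "Cl", "D")
blockers = intervening_heads(with_numeral, "Cl", "D")
```

On this simplified view, Cl-to-D movement succeeds in the bare structure and is blocked by the Numeral head in the structure with a numeral.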


¹⁶ There is no discussion of how bare nouns get a definite interpretation under this analysis;
however, it has been suggested that it involves N-to-D movement of the type discussed in
Longobardi (1994), although with common nouns, not just proper nouns. Such an analysis has
problems of its own, but I will not discuss them here for reasons of space. See footnote 20 for
further discussion.

This is a simple and elegant explanation of the numeral blocking effect. No
stipulation of the “undoing of definiteness” is required, and we have a
straightforward explanation in terms of locality and the interaction of syntactic features
and interpretation. However, I intend to argue that it is not the simplest account,
based on certain well-motivated assumptions about the structure of the DP, and
facts from other classifier languages.

In the next section I will show that numerals blocking definiteness is not a
peculiarity of Cantonese, and in fact extends to other classifier languages.
Furthermore, morphological facts from one language in particular, Weining Ahmao,
suggest that the simple HMC explanation of the Numeral Blocking effect
proposed by Simpson could not be correct, and in order to explain the full set of
typological facts, two different structures will be proposed for #-Cl-N and bare
Cl-N phrases.


4 Numerals block definiteness: Cross-linguistic 
considerations 


The blocking effect of numerals is a general effect that can be seen in other
classifier languages. Cantonese classifiers are able to signal definiteness without any
difference in the morphological shape of the classifier. That is to say, a Cl-N
sequence is interpreted as either definite or indefinite depending on context, rather
than the shape of the classifier which accompanies the noun. This is also true of
other classifier languages, including Vietnamese and Nung. However, there are
classifier languages spoken in China which exhibit "inflecting" classifiers; that is,
classifiers whose morphology encodes different interpretive features of the noun
phrase. The striking fact about those languages is that, even though definiteness
can be overtly marked on the classifier, the presence of a numeral always blocks
definiteness, and prevents the definite form of the classifier from being used. I
give a description of the classifier morphology of two languages which exhibit
inflecting classifiers in the following subsections, and show that these languages
also appear to exhibit the same numeral blocking effect as Cantonese.


4.1 Wenzhou Wu 


The southern Wu variety spoken in Wenzhou is a local dialect of one of the ten
major varieties of Chinese, Wu. Cheng & Sybesma (2005) discuss the different
interpretive possibilities for different noun phrase configurations in four varieties
of Chinese, including Wenzhou Wu (WW). They note that WW bare nouns have
the same distribution as MC bare nouns, in that they can be either definite or
indefinite in object position, and can only be interpreted as definite in subject
position.

Cl-N phrases, however, differ from both MC and Cantonese. While WW is
similar to Cantonese in allowing a definite interpretation for Cl-N phrases, it
differs from Cantonese in that a definite interpretation for a Cl-N phrase is
signalled by a shift in the tone of the classifier. As Cheng & Sybesma (2005) discuss
in detail, the eight lexical tones of the language can be divided into four
subgroups (A, B, C, and D), each subgroup containing two register subclasses, ‘hi’
and ‘lo’. I reproduce Table 2 presenting the tone values for each lexical tone here
(contour values taken from Norman 1988).


Table 2: Lexical tones of Wenzhou Wu


Tone      1: hi-A   2: lo-A   3: hi-B       4: lo-B       5: hi-C   6: lo-C   7: hi-D   8: lo-D
Contour   44        31        45 (abrupt)   24 (abrupt)   42        11        23        12


In an indefinite noun phrase containing a classifier, the classifier carries its
underlying, lexically specified tone. However, when the tone of the classifier shifts
to a D tone (no matter what the underlying lexical tone of that particular
classifier is), the Cl-N phrase is interpreted as definite. Thus, when definite, hi-A
(tone 1), hi-B (tone 3), hi-C (tone 5) all shift to hi-D (tone 7), and hi-D (tone 7)
also surfaces as hi-D. Lo-A (tone 2), lo-B (tone 4), lo-C (tone 6) and lo-D (tone
8) all surface as lo-D. A change in the morphology of the classifier gives rise to
a change in interpretation. A minimal pair can be shown for a Cl-N phrase in
object position (20), where a Cl-N phrase is acceptable under both a definite and
an indefinite reading, the difference in meaning being indicated only by the tone
on the classifier.
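
The tone shift just described amounts to a simple mapping that preserves register and replaces the tone category with D. A minimal Python sketch, with tone numbers taken from Table 2 and function names of my own choosing:

```python
# Lexical tones of Wenzhou Wu as (register, category), following Table 2.
LEXICAL_TONES = {
    1: ("hi", "A"), 2: ("lo", "A"),
    3: ("hi", "B"), 4: ("lo", "B"),
    5: ("hi", "C"), 6: ("lo", "C"),
    7: ("hi", "D"), 8: ("lo", "D"),
}


def definite_tone(lexical_tone):
    """Tone of the classifier in a definite Cl-N phrase.

    The register (hi/lo) is kept and the category shifts to D:
    hi tones surface as tone 7 (hi-D), lo tones as tone 8 (lo-D).
    """
    register, _category = LEXICAL_TONES[lexical_tone]
    return 7 if register == "hi" else 8


def surface_tone(lexical_tone, definite):
    """Indefinite classifiers keep their lexical tone."""
    return definite_tone(lexical_tone) if definite else lexical_tone


shifted = {tone: definite_tone(tone) for tone in LEXICAL_TONES}
```

Because numerals force an indefinite reading, a classifier preceded by a numeral would always go through the `definite=False` branch and keep its lexical tone.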


(20 a. gei  ma*pag? si! 
I want buy CLp-tone book 


‘I want to buy a book’ 


b. gei mei pan’ sil 


I want buy CLp-tone book 
‘I want to buy the book’ 


Because of a ban on indefinite preverbal subjects (similar to that of MC and
Cantonese), Cl-N phrases in subject position with an underlying “indefinite”
classifier tone (i.e. any non-D tone) are unacceptable:

(21) a. *dyu? kou tg . tsaw-kw kolleg
CLA-tone dog want walk-cross street
Intended: ‘A dog wants to cross the street.’
b. dyu kau?  tsaw-ku? kallgy?
CLp.tone dog want walk-cross street


‘The dog wants to cross the street.’


As shown by the example in (21b), a D-tone alternative is well formed, but
produces a definite interpretation.

What about when numerals are combined with Cl-N phrases? Cheng &
Sybesma (2005) point out that classifiers preceded by numerals keep their
underlying tone, and #-Cl-N phrases are necessarily interpreted as indefinite. That is,
definite morphology on the classifier is blocked when a numeral merges, and a
#-Cl-N phrase cannot have a definite interpretation.


(22) gei matn pay’ si! le tsh? 
I want buy four CLp-tone book come read 


‘I want to buy four books to read.’


This is another example of a case where the ability of a classifier to encode 
definiteness is blocked by a numeral, but where there is an overt morphological 
reflex of definiteness. 


4.2 Weining Ahmao 


A second, and here crucial, example of “inflecting” classifiers is the fascinating
case of Weining Ahmao (Gerner & Bisang 2008; 2010). A Miao-Yao language
spoken in western Guizhou province, Weining Ahmao (WA) encodes not only
definiteness, but also number and 'size' (diminutive, medial and augmentative)
on the classifier. The function of the 'size' inflection goes beyond encoding literal
size; it mainly carries a socio-pragmatic function whereby the particular choice
of classifier form indexes the gender and age of the speaker.


¹⁷ The only other vaguely similar socio-pragmatic classifier function that I am aware of is
exhibited in Assamese, where there are four separate classifiers for humans, which differ with
respect to the status of the human that is being referred to (Aikhenvald 2000: 102-103):


Table i: Assamese classifiers for humans


Classifier   Use
zon          Human males of normal rank (respectful)
zoni         Female animals; human females (disrespectful)
zona         High-status humans of any sex
goraki       Humans of either sex (respectful)

Male speakers typically use augmentative forms of the classifier, female speakers
the medial form, and children the diminutive form. Although this third aspect
of classifiers in the language is particularly rare and interesting, I put aside
discussion of the socio-pragmatic facts here, and concentrate instead on number and
definiteness; I direct the reader to Gerner & Bisang (2008; 2010) for an in-depth
discussion of the socio-pragmatic nuances of classifier use in the language.

Table 3 gives the abstract summary of the forms of classifiers in Weining Ah- 
mao that Gerner & Bisang (2008: 721) produce. 


Table 3: Summary of the forms of classifiers in Weining Ahmao 


Singular Plural 
Gender/Age Size Definite Indefinite Definite Indefinite 
Male Augmentative CVT C*VT Weien dila C*vT' 
Female Medial Cor" Ca tiaia CVT dia? a C* VT' 
Children Diminutive Ca” Cig? tiaa CVT — dia” al!C*VT’ 


Taking the augmentative (male) form to be the base form, C stands for simple, 
double or affricated consonant, V stands for simple or double vowel, T stands for 
tone, and the superscript numbers represent relative pitch on a scale from 1 (low- 
est) to 5 (highest). T’ indicates an altered tone from T, and * indicates a supraseg- 
mental change in the consonant, such as aspiration or devoicing, although there 
is also sometimes an absence of sound changes. To illustrate the application of 
this abstract schema with a concrete example from the language, we take the 
classifier for animacy, tu** (Gerner & Bisang 2008: 722), shown in Table 4. 


Table 4: Inflection of tu^ 


Singular Plural 
Gender/Age Size Definite Indefinite Definite Indefinite 
Male Augmentative tu^ du?! tial tutt dila” tutt 
Female Medial tait dai?'? tiai” atut dia Paru? 
Children Diminutive tat da” tia?g!!tyf diaa tutt 


As an example, (23) shows the four ways a male (adult) speaker can refer to 
oxen, with differences in number and definiteness being encoded solely on the 
classifier. 
(23) a. tutt nâu”
        CL.AUG.SG.DEF ox
        ‘the ox’
     b. du?! nhu”
        CL.AUG.SG.INDEF ox
        ‘an ox’
     c. ti? ql gt nhu”
        CL.AUG.PL.DEF ox
        ‘the oxen’
     d. didla tyt nhu”
        CL.AUG.PL.INDEF ox
        ‘(some) oxen’


Interestingly, constructions involving numerals are always interpreted as in- 
definite, and when a numeral (including numerals greater than ‘one’) is present, 
both definite forms and plural forms of the classifier are ungrammatical. A nu- 
meral therefore must occur only with an indefinite singular classifier (regardless 
of ‘size’): all other combinations are ungrammatical (Gerner & Bisang 2010: 588). 


(24) a. * 55 orl nâu”
          one CL.MED.SG.DEF ox
          Intended: ‘the one (sole) ox’
     b. j^ dai? nâu”
          one CL.MED.SG.INDEF ox
          ‘one ox’

(25) a. * tsi” la” tau?
          three CL.DIM.SG.DEF hill
          Intended: ‘the three hills’
     b. tsi? la” tau”
          three CL.DIM.SG.INDEF hill
          ‘three hills’

(26) a. * #59 tially cey?
          three CL.AUG.PL.DEF valley
          Intended: ‘the three valleys’
     b. * ts® diai?"ag![y? ^ gey?
          three CL.MED.PL.INDEF valley
          Intended: ‘three valleys’


The same is true for the quantifier pi” dzau” ‘several’: it can only occur with 
a singular indefinite classifier: 


(27) a. * pi?dzau? dzai” tci?
          several CL.MED.SG.DEF road
          Intended: ‘the several roads’
     b. pi” dzau? dzhiai?? tci?»
          several CL.MED.SG.INDEF road
          ‘several roads’


Noun phrases with a demonstrative and a Cl-N constituent, on the other hand,
always take a definite classifier.


(28) a. lu” a? v9? vhai”
        CL.AUG.SG.DEF stone DEM:MED
        ‘that stone (at medial distance from me)’
     b. * lu” a? v9? vhai”
        CL.AUG.SG.INDEF stone DEM:MED
        Intended: ‘that stone (at medial distance from me)’
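
The distributional generalizations illustrated in (23)-(28) can be restated as a small lookup, given here purely as an informal illustration rather than as part of the analysis. The sketch is in Python; the function name `allowed_forms` and the context labels `"bare"`, `"numeral"` and `"demonstrative"` are hypothetical conveniences. It assumes that the quantifier ‘several’ patterns with numerals, as in (27), and that demonstratives restrict only definiteness, not number (the text does not show plural forms with demonstratives).

```python
from itertools import product

NUMBERS = ("SG", "PL")
DEFINITENESS = ("DEF", "INDEF")
CONTEXTS = ("bare", "numeral", "demonstrative")  # hypothetical labels


def allowed_forms(context):
    """Return the (number, definiteness) classifier forms licensed in a context.

    Encodes only the descriptive generalizations reported in the text:
    - with a numeral (or 'several'), only the singular indefinite form occurs;
    - with a demonstrative, only definite forms occur (number is assumed free);
    - a bare Cl-N phrase allows all four forms.
    """
    if context not in CONTEXTS:
        raise ValueError(f"unknown context: {context!r}")
    forms = []
    for number, definiteness in product(NUMBERS, DEFINITENESS):
        if context == "numeral" and (number, definiteness) != ("SG", "INDEF"):
            continue
        if context == "demonstrative" and definiteness != "DEF":
            continue
        forms.append((number, definiteness))
    return forms


for context in CONTEXTS:
    print(context, allowed_forms(context))
```

The point of the table-like encoding is simply that the numeral context collapses the four-way paradigm to a single cell, which is the pattern any account of numeral blocking has to derive.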


This is another example of a classifier language where the coding of definite- 
ness on the classifier is blocked by the presence of a numeral. I now show how 
the facts from Weining Ahmao are problematic for the HMC account of numeral 
blocking, and propose a revised account which can capture all of the relevant 
facts. 


5 Revising the HMC account 


Recall from the previous discussion that we have the following facts to account 
for: 


1. Cl-N phrases can have a definite interpretation in some languages, but
   #-Cl-N phrases never can.

2. Classifiers in WW can have overt definiteness morphology.

3. Classifiers in WA can have overt number and definiteness morphology.

4. Classifiers cannot take definite form when a numeral is present in WW
   and WA.

5. Classifiers in WA are singular in form when a numeral is present.


Let us assume that number marking is the morphological realisation of a head,
Num, and that definiteness marking is the morphological realisation of a head,
D. I further assume here, against the proposal in Simpson (2005), and following
a number of recent proposals, that numerals merge as specifiers, not as heads
(Cinque 2005; Borer 2005; Ionin & Matushansky 2006; Ouwayda 2014).¹⁸

Further, I assume a standard approach to morphological word formation in which
syntactic operations feed morphological word formation (e.g. Travis 1984; Baker
1988; Halle & Marantz 1993, among many others),¹⁹ such that roll-up head movement
and adjunction create complex heads with complex morphology. Now,
if we follow Simpson (2005) in assuming that definiteness is licensed in Cl-N
phrases through the movement of Cl to D, then definiteness morphology on classifiers
in WW, and number and definiteness marking on bare classifiers in WA,
mean that successive-cyclic head movement of Cl through Num up to D must
be possible, with the complex head being realised in D.²⁰ This is illustrated in
(29).²¹


18 The motivations for this assumption come from various facts about complex numerals, and 
number marking related to numerals across languages. I do not have space to go through each 
of the arguments here, and instead simply direct the reader to these references. 

19 I put aside here the fact that in recent years the status of head movement as a word formation
operation has been questioned widely in the literature. See Brody (2000), Abels (2003),
Matushansky (2006), Roberts (2010), Svenonius (2012), Adger (2013), Hall (2015), among others.
Also see Hall (2015) for a similar argument about the HMC account of numeral blocking, but
with a revised account of the facts couched in the language of Brody's Mirror Theory.

20 An anonymous reviewer asks why it has to be Cl that moves to D, and not, say, N, as in Italian.
This is really a deep question about how to account for parametric variation, and I do not
have space to go into detail here, but for concreteness' sake I am adopting the position that
feature specifications on functional elements are the locus of variation. This means that there
is a feature on the classifier (say, udef) which is a goal for Agree with [def] of D, and this Agree
relation forces the subsequent head movement. N does not move because there is no feature on
N which forces movement. The question then arises about Mandarin, and N-to-D movement.
All I can say about this is that I do not adopt the position that definite bare nouns in Mandarin
involve N-to-D movement (Cheng & Sybesma 1999), and in fact think that this is a position
which has various problems associated with it. See Hall (2015: §4) for further discussion.

21 I leave aside how the relative ordering of the morphemes (Cl, Num and D) is achieved here.
(29) [DP [D [Num Cl Num] D] [NumP t_Num [ClP t_Cl NP]]]


We are left with evidence in the morphology that head movement through
these positions is possible. If Cl can move to Num as the morphology suggests,
and if numerals merge in the specifier of Num, then it should also be possible
to raise the complex classifier head to D. This movement past the numeral in
the specifier position would not constitute an HMC violation, as there are no
intervening heads in the same extended projection. This is shown in (30).²²
(30) [DP [D [Num Cl Num] D] [NumP #P [NumP t_Num [ClP t_Cl NP]]]]


As we have seen, however, this is not the case. The ability to move over the
numeral should furthermore naturally extend to Cantonese, but again, it clearly
does not. We know that the presence of a numeral robustly blocks a definite
interpretation across all classifier languages, and also definite morphology in
those languages where it exists. This means that an HMC account of the blocking
effect could not be right.²³

22 Note that, if this movement of Cl to D over the numeral were a possibility, we would also expect
to see classifiers preceding numerals where the DP is definite, and following the numeral when
the DP is indefinite, and this is never the case.


5.1 A new approach 


To capture the facts, I maintain the core assumption of Simpson (2005) that it
is indeed the interaction of Cl and D which gives rise to definite interpretations
in Cl-N configurations, but I further propose that Cl-N phrases and #-Cl-N
phrases have different syntactic structures. In a bare Cl-N configuration, the full
DP takes roughly the same form as that proposed by Simpson: D takes a NumP
complement which takes a ClP complement which takes an NP complement. Definite
classifiers are the result of movement of the Cl head to D (through Num): I
implement this through Agree between Def features on the heads, followed by
roll-up movement (Chomsky 1995).


(31) DP 


Where the def feature is not present, no movement takes place and the result 
is indefiniteness. 

Where my analysis departs from Simpson (2005) is in the structure of #-Cl-N
phrases. When a numeral is present, I assume that the classifier forms a constituent
with it, and this constituent merges in the specifier of Num. I assume
that the numeral is phrasal, and is either a specifier of Cl, or an adjunct to it.


23 Of course it is possible that Cantonese and WW and WA are all just different, and that the HMC
account does work for Cantonese, and something else is at work in WW and WA. However,
we are aiming for an explanation that can cover all of the facts in the simplest way, avoiding
language-specific stipulations where possible. I show in §5.1 that this is possible if we abandon
the HMC account.
(32) [DP D [NumP [ClP # Cl] [Num′ Num NP]]]


In this configuration, Agree between D and Cl is possible, but movement of Cl 
is blocked because of an independently motivated ban on Head Movement out 
of a specifier (see e.g. Roberts 2010), as illustrated in (33). 


(33) [DP D [NumP [ClP #P Cl] [Num′ Num NP]]]


The blocking effect is therefore not a result of the HMC, and definite plural
classifiers are therefore fully possible where Cl moves through Num to D, so
long as a numeral is not present. A further benefit of this approach is that a ban
on head movement into a specifier also prevents Num from moving into the ClP
and being realised on Cl. This explains why the classifier appears singular with
numerals in WA. The Num head has a null spell-out when it does not form a
complex head with Cl, and the Cl takes a default (singular) spell-out.²⁴


24 Amy Rose Deal (p.c.) asks whether this blocking of definiteness by a numeral might simply
be the result of the numeral always having existential force, in a way similar to that suggested
by Cheng & Sybesma, and hence whether there is any need for a syntactic explanation. A D head
merged above Num would not be able to pick out a maximal individual because it would have
already been bound off by the existential quantifier. I note that this could not be the case, as
#-Cl-N sequences can in fact have definite interpretations associated with them with the addition
of certain other elements higher in the phrase. High adjectival modifiers can give rise
to definiteness (Adj-#-Cl-N sequences), as can the introduction of a demonstrative above the
numeral. An anonymous reviewer also points out that the quantifier dou added to #-Cl-N in
subject position gives rise to a definite interpretation (Cheng 2009). This suggests that the introduction
of the numeral does not semantically block the possibility of a definite interpretation.
See Hall (2015: §4) for discussion.
5.2 Summary


Again, I restate the empirical facts which were to be explained: 


1. Cl-N phrases can have a definite interpretation in some languages, but
   #-Cl-N phrases never can.


2. Bare classifiers in WW have overt definiteness morphology. 
3. Bare classifiers in WA have overt number and definiteness morphology. 


4. Classifiers cannot take definite form when a numeral is present in WW 
and WA. 


5. Classifiers in WA are singular in form when a numeral is present. 


Each is now explained under the dual-structure account: Cl can move through
Num and D creating a complex definite head with complex morphology, if the
language has overt morphological content associated with these heads. The
#-Cl-N structure containing # and Cl as a constituent means that Cl can't move to
D, following a ban on head movement out of a specifier, which blocks a definite
interpretation. Num can't move to Cl, following a ban on head movement into a
specifier, which blocks plural morphology. Each follows from the dual structure
proposed, and appealing to these two structures means that the apparent gaps
left by the HMC approach are filled.

The two distinct structures for Cl-N and #-Cl-N are repeated here in (34-35).


25 An anonymous reviewer suggests that we might expect there to be further syntactic evidence
that the structures are different in these cases. Currently I have not been able to identify any
very clear differences aside from those already outlined at the beginning of the paper (i.e. that
#-Cl-N phrases and Cl-N phrases have a different distribution with respect to availability in
subject/topic and object position). One hint at another potential difference comes from another
comment by the same reviewer. Li (2011) points out that for some MC speakers, it is possible
to get an adjective to intervene between a numeral and a classifier, in a very restricted set of
cases:

(i) Tou shang dai le liang da duo hua.
    head on wear PERF two big CL flower
    ‘(She) wore two big flowers on her head.’

Of the two speakers that I could get to accept the above example as possible, neither could
do the same with a bare Cl-N sequence da duo hua. This is potentially another syntactic difference:
an adjective can merge in between the numeral and classifier in the structure in (35),
but it cannot appear in the bare Cl-N structure in (34). I accept that this is not knock-down
evidence of a major syntactic difference, but it is at least suggestive. I leave an investigation of
further differences between the two to future research.
(34) Cl-N:   [DP D [NumP Num [ClP Cl NP]]]

(35) #-Cl-N: [DP D [NumP [ClP # Cl] [Num′ Num NP]]]


A consequence of this analysis is that, when a numeral is present, it forms a
constituent with the classifier to the exclusion of the noun in classifier languages.
This could be seen as a counter-intuitive proposal, and in order to
fully motivate this approach it is necessary to provide some evidence for the
existence of the two structures beyond just the facts discussed above. In the next
section I offer some independent support for the proposed #+Cl constituency.


6 Classifier and numeral constituency 


There is some debate in the literature on classifiers over whether the classifier 
and numeral form a constituent, and whether this is consistent across all classifier 
languages. The variety of positions can be summarized as follows: 


(36) a. Classifier and numeral are a complex head (Kawashima 1998).
     b. Classifier is a head in the extended nominal projection (xNP),
        Numeral is a specifier of Cl (Tang 1990; or Cl is Num, numeral is
        specifier: Watanabe 2006).
     c. Classifier is a head in the xNP, Numeral is a head of NumP (Cheng &
        Sybesma 1999; Simpson 2005).
     d. Classifier is a head in the xNP, Numeral is a specifier of #P (Borer
        2005; Ouwayda 2014).
     e. Classifier and Numeral form a constituent (Fukui & Sakai 2000; also
        Ionin & Matushansky 2006).
     f. Different classifier languages have different structures depending on
        whether the classifier appears independently (Saito et al. 2008; Jenks
        2010; Hall 2015).
Most arguments in favour of a complement relation existing between the classifier
and the noun attempt to show that the classifier behaves as a functional
head, and therefore that it cannot be part of a single functional unit with the numeral.
This does not, however, suggest that the two cannot be a constituent. The
only clear argument claiming that the two could not be a constituent, at least in
MC, is proposed by Saito et al. (2008). They show that the numeral and classifier
can float to the left in Japanese, stranding the noun (37), but that the same does
not hold in MC (38).


(37) a. Taroo-wa san-satu no hon-o katta.
        Taro-TOP three-CL NO book-ACC bought
        ‘Taro bought three books.’
     b. San-satu, Taroo-wa hon-o katta.
        three-CL Taro-TOP book-ACC bought

(38) a. Zhangsan mai-le san-ben shu.
        Zhangsan buy-PERF three-CL book
        ‘Zhangsan bought three books.’
     b. * San-ben, Zhangsan mai-le shu.
        three-CL Zhangsan buy-PERF book


They posit an adjunction structure for the numeral and classifier in Japanese,
where they form a constituent. For MC they suggest that the classifier is a functional
head which takes an NP complement, and which projects a numeral in its
specifier. They thus conclude from the unavailability of movement of the numeral
and classifier in MC that the two are not a constituent there. This is not a
particularly strong argument, however, as the lack of movement could just be an
independent fact about the language, and this is not ruled out as a possibility in
their paper. I therefore continue on the assumption that my proposal is not
directly falsified by the Q-Float facts.

Given the controversy and diverse opinions related to the constituency of the 
numeral, classifier, and noun, it is necessary to provide some further motivating 
evidence for the constituency that I propose above. Therefore, in this section, I 
present some supporting evidence for the claim that the numeral and classifier 
form a constituent to the exclusion of the noun. First, I briefly argue against the 
claim that there is a strong selectional relation between the classifier and the 
noun, and also show that some cross-linguistic evidence supports a view where 
the classifier and the numeral have a closer relation than the classifier and noun 
(when both are present). I then move on to my main typological evidence that
the numeral and classifier form a constituent to the exclusion of the noun, which 
involves an argument from word order: if numeral and classifier did not form a 
separate constituent from the noun then we would expect much more variation 
in word order within the noun phrase in classifier languages than we actually 
see. 


6.1 Close relationship between classifier and noun 


The main observation that I want to take into consideration here is that there 
appears to be something like a selectional or agreement relation between the 
classifier and the noun, as the following examples illustrate. 


(39) gen: classifier for thin, slender objects
     a. yi-gen xiangjiao
        one-CL banana
        ‘one banana’
     b. * yi-gen gou
        one-CL dog
        Intended: ‘one dog’

(40) zhi: classifier for (certain) animals
     a. * yi-zhi xiangjiao
        one-CL banana
        Intended: ‘one banana’
     b. yi-zhi gou
        one-CL dog
        ‘one dog’


In (39), the classifier gen can only co-occur with a certain set of objects (namely
those which are thin and long), and there is something of a clash when the classifier
appears with a noun from outside of that class (such as ‘dog’). ‘Dog’ has
to appear with a different classifier, zhi, as illustrated in (40). An anonymous reviewer
questions how such a relationship between a classifier and a noun can
possibly be set up in a structure such as that proposed in (35). To this I have two
answers. First, I do not think that this “agreement” relationship necessarily has
to do with Agree or selection or some such purely syntactic relation between two
heads. Rather, I think that the relationship is semantic, and results from the lexical
entries for the classifiers. One illustration of this comes from an effect seen
with some speakers where nouns can be coerced into the appropriate group under
some circumstances. Two informants fully accept (40a), under a special kind
of interpretation where the banana is assumed to be particularly cute (and possibly
to have pet-like characteristics). I take this to mean that the example should
perhaps not be marked as ungrammatical, but instead as strongly semantically
implausible. Further, it seems possible that classifiers are able to shift noun
interpretation. Some nouns can appear with various different classifiers, but with
different interpretations.


(41) a. yi-bu dianhua
        one-CL telephone
        ‘one telephone’
     b. yi-tong dianhua
        one-CL telephone
        ‘one phone call’

(42) a. san-zhi hua
        three-CL flower
        ‘three flowers’ (long on their stalks)
     b. san-duo hua
        three-CL flower
        ‘three flowers’ (round, with a focus on floweryness)


I take this to mean that the noun denotes a nebulous property which includes 
each of the different possible interpretations included in the above examples 
(‘telephone’ includes telephone objects as well as calls), and then the semantics 
of the classifier includes a presupposition that the object being counted is one of 
a particular set. 


6.1.1 Classifiers in Mi'gmaq and Chol 


Some separate supporting evidence that the numeral and classifier are more
closely associated comes from Bale & Coon (2014).²⁶ They note that Mi'gmaq
and Chol both have a surprising distribution of classifiers if it's assumed that the
classifier is semantically more closely related to the noun than the numeral. The
facts are as follows.

In Mi'gmaq, the numerals 1-5 cannot appear with classifiers, but 6 and higher
must.

26 The idea that classifiers are “for” numerals, as far as the semantics is concerned, goes back to
Krifka (1995).
(43) a. na’n-ijig ji’nm-ug
        five-AGR man-PL
     b. * na’n te’s-ijig ji’nm-ug
        five CL-AGR man-PL
        ‘five men’

(44) a. * asugom-ijig ji’nm-ug
        six-AGR man-PL
     b. asugom te’s-ijig ji’nm-ug
        six CL-AGR man-PL
        ‘six men’


In Chol, there is a vestigial Mayan base-20 number system: speakers only use
Mayan numerals for 1-6, 10, 20, 40, 60 ..., and otherwise they use Spanish loan
numerals. What is important is that classifiers obligatorily appear with Mayan
numerals (45), but are obligatorily absent with Spanish numerals (46):


(45) a. ux-p’ej tyumuty
        three-CL egg
     b. * ux tyumuty
        three egg
        ‘three eggs’

(46) a. * nuebe-p’ej tyumuty
        nine-CL egg
     b. nuebe tyumuty
        nine egg
        ‘nine eggs’


Note that this is true no matter what noun we use (including Spanish loan 
nouns), and no matter what classifier the numeral combines with. 

Under an account where the numeral and classifier have a closer relationship, 
these facts immediately make sense. Under a Chierchian account where the clas- 
sifier acts as an individualizer that “portions out” chunks of the mass that nouns 
denote (Chierchia 1998a), the idiosyncratic behaviour of the numerals receives 
no explanation. This provides evidence that composition of the classifier and the 
numeral is required for the numeral to then be able to compose with the noun: 
this would make sense if # and Cl form a constituent to the exclusion of the noun. 
Of course Mi'gmaq and Chol are not related to the languages under discussion,
but, on the assumption that there is some shared syntactic category of classifier 
in the DP of all of these languages, I take this to at least be suggestive evidence 
that there is a closer relation between the classifier and the numeral than the 
classifier and the noun. 

In the next subsection I move on to some typological evidence for this close 
relation between numeral and classifier. 


6.2 Typology 


So far we have been focusing on languages where the numeral precedes the clas- 
sifier, and the classifier precedes the noun, giving the overall order in (47), illus- 
trated with examples in (48) and (49). 


(47) # > Cl > N

(48) liang gen xiangjiao
     two CLthin/pole banana
     ‘two bananas’ (MC: # > Cl > N)

(49) ib-tus tub.txib
     one-CLperson/animal messenger
     ‘one messenger’ (Hmong: # > Cl > N)


Unsurprisingly, we see cross-linguistic variation in the ordering of these ele- 
ments, and there are languages where the numeral and classifier follow the noun 
(50), (51). 


(50) hon san-satsu
     book three-CLbound/printed
     ‘three books’ (Japanese: N > # > Cl)

(51) phyata Cha?
     mat one CLflat/thin
     ‘one mat’ (Burmese: N > # > Cl)


When we look at a full typology of classifier languages, however, it becomes 
clear that the order of the numeral, classifier and noun is quite constrained. In 
Hall (2015) I discuss three word order surveys, which produce the following word 
order typology for classifier languages: 


(52) Order of numeral, classifier and noun (following Jones 1970, Greenberg
     1972, Aikhenvald 2000):
     a. # > Cl > N: very common (MC, Vietnamese, Cantonese, ...)
     b. N > # > Cl: very common (Thai, Khmer, Loniu, ...)
     c. Cl > # > N: very rare (Ibibio only)
     d. N > Cl > #: very rare/maybe no languages (possibly Bodo only)
     e. Cl > N > #: very rare (Ejagham only)
     f. # > N > Cl: not attested
A closer look at the two extremely rare cases, i.e. Ibibio (Cl > # > N) and Ejagham
(Cl > N > #), shows that they should in fact be removed from the typology. Ibibio
doesn't have classifiers at all (Essien 1990). Ejagham does not have obligatory
classifiers, and examples involving classifier-like elements discussed in Greenberg
(1972) look more like a measure phrase (see Watters 1981 and Hall 2015 for
discussion). If we remove these languages, then we have the following typology:²⁷


(53) a. # > Cl > N: very common (MC, Vietnamese, Cantonese, ...)
     b. N > # > Cl: very common (Thai, Burmese, Khmer, Loniu, ...)
     c. Cl > # > N: not attested
     d. N > Cl > #: rare (a few Bodo-Garo, Tani and Chin languages)
     e. Cl > N > #: not attested
     f. # > N > Cl: not attested


What is striking in this typology is that there are no attested orders where the
numeral and the classifier are separated by the noun.²⁸ ²⁹ It is clear that this is
completely expected if the numeral and the classifier form a constituent to the
exclusion of the noun, but remains mysterious if we posit the kind of structure
proposed by Simpson (2005). In the next subsection I will explicitly show why.
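
The adjacency generalization can be checked mechanically against the revised typology in (53). The following Python sketch is only an illustration: the set `ATTESTED` transcribes the orders marked as attested in (53), and the helper name `numeral_classifier_adjacent` is a hypothetical label introduced here.

```python
from itertools import permutations

# Orders listed as attested in (53); labels follow the text.
ATTESTED = {("#", "Cl", "N"), ("N", "#", "Cl"), ("N", "Cl", "#")}


def numeral_classifier_adjacent(order):
    """True if # and Cl are string-adjacent in the given order."""
    return abs(order.index("#") - order.index("Cl")) == 1


ALL_ORDERS = list(permutations(("#", "Cl", "N")))
# Orders in which the noun intervenes between numeral and classifier.
SEPARATED = {o for o in ALL_ORDERS if not numeral_classifier_adjacent(o)}

for order in ALL_ORDERS:
    status = "attested" if order in ATTESTED else "not attested"
    adjacency = "adjacent" if numeral_classifier_adjacent(order) else "separated by N"
    print(" > ".join(order), "|", adjacency, "|", status)
```

Of the six logically possible orders, the two in which N intervenes between # and Cl are exactly the ones that are never attested. The remaining unattested order, Cl > # > N, keeps the numeral and classifier adjacent, so its absence needs a separate explanation.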


27 I have also included some additional N > Cl > # languages (Tani and Chin languages) which
are not included in the typological studies referenced above.

28 For completeness’ sake, I give a full list of all attested word orders in classifier languages in
Table i. Note that the “example languages” column is not intended as an exhaustive list of all
of the languages that exhibit that order.


Table i: All DP internal elements

Word order                     Example languages
Num > Cl > N > A > Dem         Vietnamese, Nung, Malay
N > A > Num > Cl > Dem         Thai, Khmer, Javanese
Dem > N > A > Num > Cl         Burmese, Maru
Dem > Num > Cl > A > N         MC, Cantonese
Dem > Num > Cl > N > A         Yao
Num > Cl > A > N > Dem         Coast Tsimshian
Dem > A > N > Num > Cl         Newari, Dulong
N > A > Dem > Num > Cl         Nuosu Yi, Lahu, Akha
Dem > N > Adj > Cl > Num       Kokborok, Apatani, Mizo
Dem > Adj > N > Cl > Num       Mising, perhaps Nishi


29 See Hall (2015: §5, especially §5.4.1) for an explanation of the absence of the Cl > # > N order.
6.3 Deriving word order variation


Recent work on cross-linguistic variation in the relative order of DP internal 
elements has suggested that we can make sense of gaps in the typology in sys- 
tematic ways, under certain assumptions about the nature of DP internal roll-up 
movements (Cinque 1996; 2005), or with a flexible approach to the linearization 
of the unordered sets produced by Merge (Abels & Neeleman 2012). I give a brief 
summary here of the two related approaches, and then show what predictions 
they would produce with respect to word order variation in classifier languages, 
on the assumption that the classifier takes an NP complement.


6.3.1 Cinque (2005): Universal 20 


Cinque (2005) shows that each of the 14 attested orders of Demonstrative, Numeral,
Adjective and Noun can be generated, while ruling out each of the 10
unattested orders, if the following constraints on movement operations are applied:


(54) a. Merge order: [... [WP Dem ... [XP Num ... [YP A [NP N]]]]]
     b. Parameters of movement
        i.   No movement, or
        ii.  Movement of NP plus pied-piping of the whose picture type
             (movement of [NP[XP]]), or
        iii. Movement of NP without pied-piping, or
        iv.  Movement of NP plus pied-piping of the picture of who type
             (movement of [XP[NP]]).
        v.   Total versus partial movement of the NP with or without
             pied-piping (either NP moves all the way up or only partially)
        vi.  Neither head movement nor movement of a phrase not
             containing the (overt) NP is possible.


The first assumption of a fixed universal hierarchical order of elements in the
DP gives us the underlying structure in Figure 1.

Figure 1: Proposed universal base structure of the DP from Cinque (2005)

Cinque assumes that modifiers are merged in the specifiers of functional heads
in the xNP, and that antisymmetry (i.e. the LCA of Kayne 1994) rules out symmetric
base generation of modifiers, meaning that all postnominal modifiers must
be generated through movement of the NP, or some constituent containing the
NP. Each of the elements demonstrative, numeral and adjective is taken to be
a phrasal element which merges in the specifier of a functional head. In each case
of movement, the NP, or pied-piped constituent containing the NP, moves to the
specifier of an Agr head above the contentful phrasal element. The noun phrase
can move to any of the Spec Agr positions (54b-iii), and can pied-pipe any constituent
either in the form [NP[XP]] (54b-ii) or [XP[NP]] (54b-iv). This movement
can be partial (to one of the intermediate Agr positions), or complete (all the way
to the highest Agr projection). Through a combination of movement steps, which
must follow the constraints in (54), each of the attested orders can be derived.


6.3.2 Abels & Neeleman (2012) 


Abels & Neeleman (2012) argue that all of the orders that are generated by Cin- 
que's approach can in fact be produced without some of the assumptions that 
Cinque makes about phrase structure and movement. They show that a more 
constrained theory of movement, coupled with flexibility in the linearization of 
sister nodes (eschewing the LCA) generates the same results. 
(55) a. The underlying hierarchy is Dem > Num > A > N (where > indicates
        c-command);
     b. there is cross-linguistic variation with respect to the linearization of
        sister nodes in this structure;
     c. all (relevant) movements move a subtree containing N;
     d. all movements target a c-commanding position;
     e. all movements are to the left.


The idea is that, with the underlying structure shown in (56), eight different 
word orders can be generated if we assume that linearization of sisters is flexible. 


(56) [Dem [Num [A N]]]

(57) Base generated orders
     a. Dem Num A N
     b. N A Num Dem
     c. Dem Num N A
     d. A N Num Dem
     e. Dem A N Num
     f. Num N A Dem
     g. Dem N A Num
     h. Num A N Dem


The remaining six orders are generated through movement constrained in the
ways noted in (55). Simply put, this approach produces the same results, but
appeals to flexible linearization of sisters instead of massive roll-up movement.
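
The eight base-generated orders follow from a simple combinatorial fact: a binary-branching structure with three non-terminal nodes has 2 × 2 × 2 = 8 linearizations if each pair of sisters may be ordered either way. The Python sketch below illustrates this under the assumption that the hierarchy in (55a) corresponds to the nested structure [Dem [Num [A N]]]; the tuple encoding and the function name `linearizations` are hypothetical, and movement, which yields the remaining six orders, is not modelled.

```python
def linearizations(tree):
    """Return every terminal string obtainable by freely ordering sister nodes.

    A tree is either a terminal (a string) or a pair (left, right) of subtrees.
    """
    if isinstance(tree, str):
        return [[tree]]
    left, right = tree
    results = []
    for l in linearizations(left):
        for r in linearizations(right):
            results.append(l + r)  # sisters in the order given
            results.append(r + l)  # sisters in the reverse order
    return results


# Assumed encoding of the hierarchy Dem > Num > A > N from (55a).
HIERARCHY = ("Dem", ("Num", ("A", "N")))

BASE_ORDERS = sorted(" ".join(order) for order in linearizations(HIERARCHY))
for order in BASE_ORDERS:
    print(order)
```

Because every subtree is linearized as a unit, an element can never end up between two elements that form a constituent without it. For example, Dem can never surface between A and N in a base-generated order.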


6.3.3 Predictions 


For our purposes, either approach to cross-linguistic variation in word order will 
do, and I remain agnostic as to which is the preferred approach. Here we are 
trying to account for the gaps in classifier language word order typology: in 
particular, why the classifier and the numeral are never separated by the noun. 
Whether we take a roll-up movement approach following Cinque, or a flexible 
linearisation approach following Abels & Neeleman, we would expect the noun 
to be able to appear between the numeral and the classifier under any analysis of 
DP internal structure which takes the classifier to be a head taking the noun as 
a complement, and which takes the numeral to appear in a specifier or adjunct 
position above the classifier (i.e. 36b-c above). If the numeral is merged in the 
specifier of Num, then, under the roll-up movement approach, both Cl > N > #
(58) and # > N > Cl (59) can be generated.³⁰ ³¹


(58) [Tree diagram: AgrP structure deriving Cl > N > #]
(59) [Tree diagram: NumP structure deriving # > N > Cl]


Under the flexible linearization approach too, both Cl > N > # (60) and # > N > 
Cl (61) can be generated: 


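(60) [Tree diagram: NumP containing the numeral #, with Num taking ClP; ClP is linearized as Cl NP, giving Cl > N > #]
(61) [Tree diagram: NumP containing the numeral #, with Num taking ClP; ClP is linearized as NP Cl, giving # > N > Cl]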


If, on the other hand, the numeral and classifier form a constituent to the ex- 
clusion of the noun, as I have proposed, then we predict that the numeral and 
classifier should not be separated by the noun, and get the typological result for 
free. This is not a knockdown argument against an alternative, but it is something
that would require explanation if we accept that the classifier takes N as
its complement, and requires no explanation at all if Cl and # form a constituent.

³⁰ I follow Cinque (2005) in having the specifier of an Agr head as a landing site, but have left
out irrelevant Agr positions (i.e. Agr positions which are not the landing site of movement).

³¹ A reviewer points out that different assumptions about the numeral (it heads its own projection
vs it is in a specifier of another head) would lead to different predictions about what word
orders are possible. This is true, but under all approaches (except for where the numeral and
classifier go together as a separate constituent) we still expect the numeral and classifier to be
separable, with the noun intervening.
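
The prediction can be illustrated with a small sketch under the flexible linearization approach. It is a simplification, not the author's formal system: the two competing analyses are encoded as nested pairs, silent heads and movement are ignored, and the names `HEAD_COMPLEMENT`, `CONSTITUENT` and `noun_can_intervene` are hypothetical.

```python
from itertools import product


def linearizations(node):
    """Yield every word order obtained by freely ordering sister nodes."""
    if isinstance(node, str):
        yield [node]
        return
    left, right = node
    for l, r in product(linearizations(left), linearizations(right)):
        yield l + r
        yield r + l


def noun_can_intervene(structure):
    """Return True if some linearization places N between # and Cl."""
    for order in linearizations(structure):
        i_num, i_cl, i_n = order.index("#"), order.index("Cl"), order.index("N")
        if min(i_num, i_cl) < i_n < max(i_num, i_cl):
            return True
    return False


# Cl takes N as its complement; the numeral sits above Cl.
HEAD_COMPLEMENT = ("#", ("Cl", "N"))
# The numeral and the classifier form a constituent that excludes N.
CONSTITUENT = (("#", "Cl"), "N")

print(noun_can_intervene(HEAD_COMPLEMENT))  # True: e.g. # N Cl
print(noun_can_intervene(CONSTITUENT))      # False
```

Under the first encoding the orders # N Cl and Cl N # are both generated, whereas under the second no reordering of sisters can place the noun inside the numeral-classifier constituent.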


7 Conclusion 


In this paper I have argued that a traditional account of the "numeral blocking" 
effect in classifier languages, which appeals to the Head Movement Constraint, 
should be revised in light of new empirical evidence from classifier languages 
with overt number and definiteness morphology on the classifier. I have sug- 
gested that a revised account, which can capture all of the empirical facts, leads 
us to the conclusion that there must be two separate syntactic structures for #-
Cl-N phrases and Cl-N phrases in these languages, and that when a numeral
is present, the numeral and the classifier form a constituent to the exclusion of 
the noun. This conclusion is supported by typological evidence: there are no lan- 
guages attested which exhibit a DP internal word order where the classifier and 
the numeral are separated by the noun, which would be mysterious under stan- 
dard approaches to cross-linguistic word order variation in the DP, but which 
falls out naturally under the account proposed here.


On kinds and anaphoricity in languages
without definite articles


Miloje Despić 


Cornell University 


This paper investigates the availability of anaphoric readings with bare nouns in 
languages without definite articles, with a special focus on kind-level
interpretation. Various facts from Serbian, Turkish, Japanese, Mandarin, and Hindi show
that the anaphoric reading of bare nouns is constrained by two general factors: (i) 
number morphology; in particular, whether the language in question has number 
morphology to begin with, and if it does, whether the bare noun in question is mass 
or count, and (ii) kind interpretation. It seems that mass and plural nouns can have 
anaphoric readings only if they are not interpreted as kinds. Singular count bare 
nouns, on the other hand, do not seem to be restricted in this way: they can have 
anaphoric readings regardless of whether or not they are interpreted as kinds. I ar- 
gue that this state of affairs naturally follows from the system developed in Dayal 
(2004), which is based on a limited set of type-shifting operations and a particular 
analysis of number morphology. Alternative approaches to interpretation of bare 
nouns, on the other hand, do not seem to directly predict this sort of variation and 
require additional assumptions to account for it. 


1 Introduction 


In this paper, I explore the anaphoric definite interpretation of bare nouns in 
languages without definite articles. Evidence presented here reveals an interest- 
ing generalization about the availability of anaphoric readings with bare nouns, 
which requires an adequate explanation. In particular, it seems that the anaphoric 
interpretation of a bare noun depends on (i) whether the noun in question is
singular or mass/plural and (ii) whether or not it is interpreted as kind-denoting.
I will present data from Serbian, Turkish, Japanese, Mandarin and Hindi
to illustrate this phenomenon. Before introducing the main empirical puzzle,
it is useful to go over two major types of approaches to the structure and inter- 
pretation of NPs in languages without definite articles. 

A theoretical challenge for anyone dealing with bare nouns in languages without
articles is how to formally treat the absence of the definite determiner.¹ On
the one hand, there is what we may call the Universal DP Approach (UDP), on
which DP is present in all languages, regardless of whether they have a definite
article or not (e.g. Longobardi 1994; Cinque 1994; Scott 2002; Pereltsvaig 2007,
etc.). The central claim of this line of research is that even article-less languages
have a definite article (i.e. a D head) in syntax, but unlike in languages like En- 
glish, the article is unpronounced/covert. In some versions of it, a fixed layer of 
functional projections is present in the nominal domain of all languages: 


(1) Determiner > Ordinal Number > Cardinal Number > Subjective 
Comment > ?Evidential > Size > Length > Height > Speed > ?Depth > 
Width > Weight > Temperature > ?Wetness > Age > Shape > Color > 
Nationality/Origin > Material > Compound Element > NP (Scott 2002: 
114) 


The idea here is that the structure of the nominal domain of all languages is 
underlyingly identical and involves a functional spine in (1), which is very similar 
to the adverbial functional spine proposed in Cinque (1999), for example. On the 
other hand, the DP/NP approach assumes that DP is present only in languages 
with articles. In this kind of approach, the lack of (overt) articles actually indicates 
a simpler syntactic structure, i.e. NP (Baker 2003; Bošković 2008; 2012; Despić 
2011; 2013; 2015). The contrast between the two types of languages in the DP/NP 
approach is illustrated in (2).


¹ This is part of a more general question of how to treat a construction/language which lacks a
particular morpheme that is otherwise present in other constructions/languages.

(2) a. Languages with definite articles
       [Tree diagram: DP projected on top of the nominal domain (F4 and F2: potential functional projections)]
    b. Languages without definite articles
       [Tree diagram: DP projection absent (F4 and F2: potential functional projections)]


There seem to be a number of cross-linguistic (and language-specific) syntactic
patterns which are strongly correlated with whether or not definiteness
marking is overtly present (e.g. Bošković 2008). Two such generalizations are 
given in (3) (see Bošković 2008 for more): 


(3) a. Only languages without articles may allow Left Branch Extraction
(Bošković 2008; 2012).

b. Reflexive possessives are available only in languages which lack 
definiteness marking, or which encode definiteness postnominally. 
Languages which have prenominal (article-like) definiteness marking, 
on the other hand, systematically lack reflexive possessives (Reuland 
2011; Despić 2015).

Correlations like these are expected on the DP/NP approach, since the presence
of the definite article in a language indicates a richer syntactic structure in
the nominal domain. For example, to explain (3b), Despić (2015) proposes that
DP is a binding domain, in contrast to NP, which is not (see Bošković 2012 and
Despić 2015 for discussion of 3a).² Then in languages with prenominal definite
articles, illustrated with English in (4), the reflexive possessive is not bound in
its binding domain.


(4) a. [DP [D the] [PossP *Reflexive [Poss' Poss NP]]]

    b. Johnᵢ likes hisᵢ/*himself's dog.


In languages without definite articles, on the other hand, the nominal domain
lacks DP, and hence a binding domain, by assumption, and reflexive possessives are
therefore in principle ruled in. Finally, for languages with postnominal definiteness
marking, it can be assumed that PossP moves out of DP (as indicated by the
word order), which again rules in reflexive possessives. The general point is that,
in the DP/NP approach, it is expected that at least some syntactic patterns would
be directly sensitive to the overt presence/absence of the definite article.

² LEFT BRANCH EXTRACTION (LBE) refers to situations in which a nominal modifier can be
syntactically moved/fronted to the exclusion of the noun it modifies. Bošković (2008; 2012) observes
that LBE is possible only in languages without articles. For example, while a construction like
(i.a) is grammatical in Serbian, an article-less language, its English counterpart is ungrammatical
(see i.b).

(i) a. Serbian
       Lepeᵢ je vidio [tᵢ kuće].
       beautiful is seen houses
       ‘Beautiful houses, he saw.’

    b. English
       *Beautifulᵢ, he saw [tᵢ houses].

This strongly suggests that languages with and without definite articles have different
nominal structures; e.g. while languages with articles project DP, which can block movement/LBE,
languages without articles seem to lack this projection (i.e. their nominal structure is simpler;
see 2b).

In the UDP, such correlations appear accidental, since the presence of DP in
the syntactic structure is independent of its morpho-phonological manifestation.
To be clear, they are not strictly incompatible with the UDP, but additional as- 
sumptions are necessary to account for them. The question is, of course, whether 
these additional assumptions would simply re-describe the facts or actually pro- 
vide true insight and be independently motivated. At the same time, one may 
wonder about the predictive power of the UDP; i.e. what kind of facts would 
ultimately be able to falsify it? 

On the semantic side, it is clear that bare nouns in languages without arti- 
cles can have definite, anaphoric readings, unlike in languages like English. The 
question is then what is responsible for the availability of this anaphoric reading, 
given that the anaphoric reading in languages like English requires the definite 
article. In the UDP, the presence of a phonologically null determiner creates this 
interpretation (e.g. Longobardi 1994). There is ultimately very little difference be- 
tween English and an article-less language like Serbian: the definite, anaphoric 
reading in both of them is created by a definite D head. The only difference is that, 
in contrast to English, D is not overtly realized in Serbian. On the other hand, ap- 
proaches that do not assume null D heads argue that a limited set of type-shifting 
operations is responsible for the general interpretation of bare nouns, including 
the anaphoric reading (e.g. Chierchia 1998; Dayal 2004). 

In this paper, I focus on anaphoric, definite readings of bare nouns in languages
without definite articles.³ I show that their availability crucially depends on two
factors (among other things): (i) number morphology and (ii) kind interpretation.
I argue that the particular cross-linguistic variation discussed here is expected in
the system developed in Dayal (2004), which employs type-shifting operations
and a specific view of number morphology. As discussed in §3-5, the system
based on type-shifting operations developed in Chierchia (1998) and Dayal (2004)
is far from being unconstrained. That is, type-shifting operations do not apply
arbitrarily. For example, the so-called BLOCKING PRINCIPLE regulates the availability
of covert type-shifting operations by making sure that if a language has
a lexical item whose meaning is a particular type-shifting operation, then that
item must be used instead of the covert version. For this reason, for example, bare
nouns in English (mass or plural) cannot have definite meaning: the covert
type-shifting operation that would create this meaning is blocked by the existence of
the overt lexical item the. Also, covert type-shifting operations that are not excluded
by the Blocking Principle are not equally available, but are rather ranked
in terms of meaning preservation/simplicity; e.g. the operation responsible for
kind reference, $^{\cap}$, is ranked higher than $\exists$, and the latter may apply only if $^{\cap}$ is
undefined for some argument (see §3). Both of these principles are independently
motivated; e.g. the Blocking Principle follows the general logic of the ELSEWHERE
CONDITION (language particular choices win over universal tendencies).

³ For an overview of different aspects of the meaning of definite descriptions see Schwarz (2009)
and references therein.
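
The interaction of the Blocking Principle with the ranking of covert shifts can be sketched as a small selection procedure. This is only an illustration under simplifying assumptions: the shift names (`"nom"` for the kind-forming operation, `"iota"` for the definite shift, `"exists"` for the existential shift), the function `available_covert_shifts`, and the way a language's lexicon is represented are all hypothetical, and only the two constraints described in the text are modeled.

```python
def available_covert_shifts(lexicalized, nom_defined):
    """Return the covert type shifts that a bare noun may undergo.

    lexicalized: set of shifts expressed by an overt lexical item in the
                 language (e.g. {"iota"} for a language with a definite article).
    nom_defined: whether the kind-forming shift is defined for this noun.
    """
    candidates = ["nom", "iota", "exists"]
    # Blocking Principle: an overt lexical item pre-empts its covert counterpart.
    allowed = [shift for shift in candidates if shift not in lexicalized]
    if nom_defined:
        # Ranking: the existential shift is a last resort, usable only
        # when the kind-forming shift is undefined for the argument.
        allowed = [shift for shift in allowed if shift != "exists"]
    else:
        # An undefined kind-forming shift cannot apply at all.
        allowed = [shift for shift in allowed if shift != "nom"]
    return allowed


# A language with an overt definite article (English-like), plural or mass noun.
english_like = available_covert_shifts({"iota"}, nom_defined=True)
# A language without articles (Serbian-like), noun for which kind formation is defined.
article_less = available_covert_shifts(set(), nom_defined=True)

print(english_like)   # ['nom']
print(article_less)   # ['nom', 'iota']
```

On this toy encoding, the covert definite shift is unavailable in the English-like setting because the overt article blocks it, while it remains available in the article-less setting.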

At the same time, the data discussed in this paper raise certain questions for
the UDP, which seems to require extra assumptions to explain them, and it is not
clear to what extent these assumptions could be independently motivated. In
the remainder of the paper, I will therefore focus on demonstrating how the facts
presented in the next section follow from Dayal's (2004) proposal.

The paper is organized as follows. In §2 I present the main empirical puzzle,
while in §3 I show how it can be explained under Dayal’s (2004) approach. In §4
I discuss some predictions and consequences of the data and analysis introduced
in §2 and §3. Finally, a summary and concluding remarks are offered in §5. Here
I also offer some thoughts on how the generalizations presented in this paper
and Dayal (2004) can be connected to the distinction between weak and strong
definiteness (e.g. Schwarz 2009).


2 The puzzle: Anaphoricity and kinds 


In this section, I present the central empirical problem of the paper. Bare singular 
count nouns in languages without articles can be used anaphorically to refer to a 
previously introduced individual. Thus, the bare noun book in both Serbian (see 
5) and Turkish (see 6) can refer to Crime and Punishment in the antecedent clause. 
English, on the other hand, must use the definite article (or demonstrative) in the 
same situation. 


(5) Serbian
Juče sam pročitao “Zločin i Kaznu” - knjiga mi
yesterday am read Crime and Punishment book-NOM me-DAT
se zaista svidela.
REFL really liked
‘Yesterday I read Crime and Punishment - I really liked the book.’


(6) Turkish
Dün “Suç ve Ceza” okudum - kitap harikaydı.
yesterday Crime and Punishment read-PST book terrific-PST
‘Yesterday I read Crime and Punishment. The book was terrific.’

As shown in (7-11), the same holds for Mandarin, Japanese and Hindi, which are also
languages without definite articles (note that Mandarin and Japanese do not mark
number, which will become relevant in §3 and §4). In the Mandarin examples in (7),
the bare nouns shu ‘book’ and ta ‘tower’ are used to refer anaphorically to Crime
and Punishment and Oriental Pearl, respectively. In (8), the bare noun mao ‘cat’
refers to the NP in the antecedent clause. The Japanese examples in (9) illustrate
the same point: hon ‘book’ in (9a) refers to Crime and Punishment, while
roojin ‘old man’ in (9b) refers to the proper name Yahachi. Examples from Hindi
are given in (10) and (11). Now, although anaphoric readings with bare nouns
are available in these languages, it should be noted that nouns with demonstratives
or simple pronouns are preferred in many contexts, for a number of pragmatic
and discourse reasons, which I will not discuss here. What is crucial is
that such use of bare nouns in languages like English is disallowed regardless of
discourse/context properties (that is, bare singular nouns are in general
ungrammatical in English).


(7) Mandarin

a. Wo kan le Zuiyufa. Shu zai zhuozi-shang.
I read ASP Crime and Punishment book be at table-TOP
‘I read Crime and Punishment. The book is on the table.’

b. Wo canguan le Dongfangmingzhu. Ta hen gao.
I visit PTCP Oriental Pearl tower very tall
‘I visited the Oriental Pearl. The tower is high.’


(8) Mandarin
Wo kanjian yi-zhi mao. Mao zai huayuan-li.
I see one-CLF cat cat at garden-inside
‘I see a cat. The cat is in the garden.’ (Dayal 2004: 403)


(9) Japanese

a. Kinou “Tsumi to Batsu”-o yonda. Hon-wa
yesterday Crime and Punishment-ACC read-PST book-TOP
subarashikatta.
fantastic-PST
‘Yesterday I read Crime and Punishment. The book was fantastic.’

b. Yahachi-o miru-to, roojin-wa damatte unazuita.
Yahachi-ACC see-when old man-TOP silently nodded
‘When I saw Yahachi, the old man silently nodded.’ (Fujisawa 1992: 14)


(10) Hindi
Kal mei-ne Crime and Punishment pari aur kitaab bariya hai.
yesterday I-ERG Crime and Punishment read and book excellent is
‘Yesterday I read Crime and Punishment and the book is excellent.’


(11) Hindi
Kuch bacce andar aaye. Bacce bahut khush the.
some children inside came children very happy were
‘Some children came in. The children were very happy.’ (Dayal 2004: 403)


Consider now bare mass nouns. When they are used in a kind-denoting con- 
text they cannot be used anaphorically in these languages. For example, meyve 
‘fruit’ in (12) cannot pick out üzüm ‘grapes’ in the antecedent clause, just like 
voće ‘fruit’ cannot refer to grožđe ‘grapes’ in (13). They only have the implausible
general meaning - the second clause in these examples can be interpreted only 
as a statement about fruit in general, not about a particular kind of fruit (grape) 
introduced in the antecedent clause. 


(12) Turkish
Ömrüm boyunca üzüm yetiştirdim. #(Bu) meyve herşeyim
my life throughout grape produce this fruit my everything
oldu.
became
‘I have been producing grapes my whole life. (This) fruit is everything to
me.’
- # if meyve ‘fruit’ is anteceded by üzüm ‘grapes’
- OK if bu meyve ‘that fruit’ is anteceded by üzüm ‘grapes’


(13) Serbian

a. Naše mesto već generacijama proizvodi belo grožđe. Sve
our town already generations produces white grape everything
dugujemo #(tom) voću.
owe (that) fruit-DAT
‘Our town has been producing white grapes for generations. We owe
everything to (that) fruit.’
- # if voću ‘fruit’ is anteceded by grožđe ‘grapes’
- OK if tom voću ‘that fruit’ is anteceded by grožđe ‘grapes’

b. ... (To) voće je jako ukusno.
that fruit is very tasty
‘... (That) fruit is very tasty.’
- # if voće ‘fruit’ is anteceded by grožđe ‘grapes’
- OK if to voće ‘that fruit’ is anteceded by grožđe ‘grapes’


In order to get the anaphoric reading, a demonstrative must be used. These
examples are minimally different from those in (5-6), which in contrast do allow
anaphoric interpretation of the bare noun. Also note that whether voće ‘fruit’ in
Serbian is in the subject or object position is irrelevant for anaphoricity.⁴ ⁵

We see a similar pattern in Mandarin, Japanese and Hindi, as illustrated with 
some examples below. All of my informants find a strong contrast in the avail- 
ability of anaphoric reading between examples (7-11), on the one hand, and the 
ones in (12-16), on the other. Just like in (12-13), the second clause in (14-16) be- 
low can be interpreted only as a general statement about fruit, not as a statement 
about a particular kind of fruit mentioned in the antecedent clause; i.e. ‘Fruit is 
our life’ in (14) cannot be interpreted as 'Apples are our life’. 


(14) Mandarin
Women shidai zhong pingguo. Shuiguo jiu shi women de ming.
we generation grow apple fruit PTCP is we GEN life
‘We have been growing apples for generations. Fruit is our life.’


⁴ Turkish, however, has differential object marking and in accusative case makes a morphological
distinction between specific and non-specific objects (e.g. Enç 1991).

⁵ Other mass nouns behave in a similar way; e.g. vino ‘wine’ in (i.b) below cannot be anteceded
by Vranac (a special type of wine) in (i.a) without the demonstrative. Both voće ‘fruit’ and
vino ‘wine’ in Serbian in general require a classifier phrase (like truckload of or glass of) or a
measure phrase (like lot of) for counting, which is typical of mass nouns. At the same time,
they are very useful here because they have well-established subclasses/subtypes (in contrast
to, say, sand), which could in principle serve as pragmatically plausible antecedents. The fact
that the anaphoric relationship cannot be formed in these examples, thus, cannot be due to
pragmatic factors.

(i) Serbian
a. Naše mesto već generacijama proizvodi “Vranac”.
our town already generations produces Vranac
‘Our town has been producing Vranac for generations.’

b. Sve dugujemo #(tom) vinu.
everything owe (that) wine
‘We owe everything to (that) wine.’


(15) Japanese
Watashitachi-wa daidai budou-o sodatetekita. #(Kono)
we-TOP for-generations grapes-ACC have grown this
kudamono-wa subarashi.
fruit-TOP fantastic
‘We have been growing grapes for generations. This fruit is fantastic.’


(16) Hindi
Mei-ne angur ki kheti mei saari jeevan biaayi hai aur #(ye) phal-ne
I-ERG grapes of farming in all life spend is and this fruit-ERG
mujh-ko ameer bana dija hai.
me-ACC rich make-PST give-PST is
‘I have been growing grapes all my life and the fruit has made me rich.’


Now, a mass noun with a kind reading can be used anaphorically in English, if 
it is accompanied by the definite article. Consider, for instance, (17) in which ‘the 
fruit’ is anteceded by ‘grapes’. Many speakers I have consulted find the anaphoric 
reading in (17) perfectly possible, although some of them would still prefer the 
demonstrative ‘that’ instead of ‘the’, presumably for the same type of reasons 
mentioned in the discussion of (5-11).⁶ ⁷


(17) We have been growing grapes for generations — and you know, we have 
made millions on the fruit. 


Why would this be the case? Why would the existence of kind-reference affect
the anaphoric potential of a bare noun in article-less languages in such a way?
This state of affairs seems to raise some non-trivial questions for the basic version
of the UDP approach. In particular, if the covert version of the definite article,
which is overt in English, is responsible for the definite reading of the bare nouns
in (5-11) (e.g. knjiga ‘book’), why can it not produce the same effect in (12-16)
(with the bare noun voće ‘fruit’), given that ‘the fruit’ in English (17) has the
definite article? In the UDP all languages have identical underlying structure in
the nominal domain, and the phonologically null/covert D in Serbian or Turkish
should in principle perform the same function as its overt version in languages
like English; e.g. it assigns the definite/anaphoric interpretation to, say, knjiga
or kitap ‘book’ in (5-6), just like the overt article the does in English. One could
assume that, for some reason, covert versions of D are more limited in meaning,
and cannot combine with, for instance, kind-denoting nouns, but this would have
to be independently supported. That is, these additional assumptions would have
to explain why the opposite situation does not arise.

⁶ What seems to be clear is that the bare noun fruit in (i) has no anaphoric potential; i.e. the
second clause in (i) is interpreted as a general statement about fruit, which is exactly the kind
of judgment speakers of languages without articles discussed here have for (12-16).

(i) We have been growing grapes for generations — and you know, we have made millions on
fruit.

⁷ Similar facts about anaphoricity of mass nouns interpreted as kinds have also been observed
by Dayal (2004: ft. 43, 435-436), who points out that “...mass terms can occur with a definite
if anaphorically linked to an antecedent, even if such anaphoricity leads to kind reference, as
in (i)”:

(i) Patients need medicine and food. (The) medicine fights the disease and (the) food builds up
strength.

See §5 for a discussion of kinds in connection with the distinction between unique and familiar
definites.

Note that the real culprit here is the presence of kind-reference. In other words,
bare mass nouns in languages without definite articles can have anaphoric readings
in the absence of kind interpretation. This is shown in (18-22): in all of these
examples the antecedent clause describes a particular object-level entity, and the
bare mass nouns in the second clause (‘fruit’ or ‘wine’) can be anaphorically
anteceded by it. This is true even though these examples are overall very similar
to those in (12-16) - the only difference is that the latter force the kind-level
interpretation. That is, bare mass nouns can have both kind-level and object-level
interpretation, but the anaphoric reading is possible only in the latter case (see
Chierchia 1998: §4 and references therein for the kind vs. object-level
distinction). Compare (18a-b) with (13), for instance. As discussed in Chierchia (1998),
from an intuitive, pretheoretical point of view, kinds are seen as regularities that
occur in nature - although they are similar to individuals, “their spatiotemporal
manifestations are typically ‘discontinuous’” (Chierchia 1998: 348). That is, a
kind can be identified in any given world with the totality or sum of its instances.
It may lack instances in a world/situation (e.g. dodo), but something that is
necessarily instantiated by just one individual (e.g. Noam Chomsky) would not qualify
as a kind (this contrast will in fact play one of the central roles in the explanation
offered in the next section). So in (13), for example, we interpret the mass noun
as an idealized sum of its instances with discontinuous spatiotemporal
manifestations, which is highlighted by the use of the expression ‘for generations’ - we
clearly do not interpret it as a particular object-level instantiation of the mass
noun (e.g. a bowl of fruit). In (18b), on the other hand, we have exactly that - a
specific, object-level interpretation of the mass noun, with a specific quantity, at
a specific time/situation. And it is exactly in this case that the anaphoric relationship
can be established.

Also, as in the case of examples in (5-11), an NP with a demonstrative or a
simple pronoun might be preferred in (18-22), but the bare noun is nevertheless
quite possible. What is important is that there is a substantial contrast between
this set of examples and those in (12-16), in which the anaphoric reading is not
available without the demonstrative.


(18) Serbian

a. Juče sam po prvi put pojeo nekoliko brazilskih papaja. Voće je
yesterday am at first time ate a.few Brazilian papaya fruit is
zaista fantastično!
truly fantastic
‘Yesterday I ate a few Brazilian papayas for the first time. The fruit is
fantastic!’

b. Danas sam kupio malo grožđa, hleb i mleko. Voće sam stavio u
today am bought bit grapes bread and milk fruit am put in
frižider, a sve ostalo na sto.
fridge and all else on table
‘Today I bought some grapes, bread and milk. I put the fruit in the
fridge and the rest on the table.’

- OK if voće ‘fruit’ is anteceded by grožđe ‘grapes’

c. Sa prijateljima sam juče popio tri flaše Dom Perinjon-a.
with friends am yesterday drank three bottles Dom Perignon
Vino je zaista fantastično.
wine is truly fantastic
‘I drank three bottles of Dom Pérignon yesterday. The wine is truly
fantastic.’

- OK if vino ‘wine’ is anteceded by Dom Pérignon


The examples below behave the same way: 


(19) Turkish
Dün üzüm, peynir ve süt aldım. Meyve pahalıydı ama
yesterday grape cheese and milk buy-1.PST fruit expensive-PST but
diğerleri hesaplıydı.
rest affordable-PST
‘I bought grapes, cheese and milk yesterday. The fruit was expensive but
the rest was affordable.’

(20) Mandarin

a. Wo ba na dai pingguo fang dao zhuozi-shang, danshi shuiguo
I ba that packet apple put towards table-TOP but fruit
yixiazi jiu diao-chulai le.
all-of-a-sudden PTCP fall-out ASP
‘I put the packet with apples on the table, but the fruit immediately
fell out of it.’

b. Wo mai le san ge pingguo, niunai he baozhi. Shuiguo hen
I bought ASP three CLF apple milk and newspaper fruit very
gui, qita dongxi dou hen pianyi.
expensive other things all very cheap
‘I bought three apples, milk and newspapers. The fruit was expensive;
the other things were cheap.’⁸


(21) Japanese

a. Tana-no ue-no ringo-o miruto, kudamono-wa sudeni kusatte
shelf-GEN top-GEN apple-ACC saw time fruit-TOP already rotten
ita.
was
‘When I saw the apple on the shelf, the fruit was already rotten.’

b. Kinou budou to chiizu to gyuunyuu-o katta. Kudamono-wa
yesterday grape and cheese and milk-ACC bought fruit-TOP
teeburu-ni oite, hoka-wa reizouku-ni ireta.
table-at put-and rest-TOP fridge-in insert-PST
‘Yesterday I bought grapes, cheese and milk. I put the fruit on the
table and the rest in the fridge.’


⁸ Contrastive particle jiu before ‘fruit’ in (20b) makes the anaphoric relation clearer, but it is
not necessary - (20b) is fine without it. Also, Jenks (to appear) observes that Mandarin seems
to make a principled distinction between unique and anaphoric definites (e.g. Schwarz 2009);
while unique definites are realized as bare nouns, anaphoric definites are realized with a
demonstrative, except in subject positions, where bare nouns can also be interpreted anaphorically.
For this reason, in all Mandarin examples in this paper bare nouns are located in subject
positions.

(22) Hindi
Aaj mei-ne angur, dudh, aur paneer kharidi aur phal mehenga tha
today I-ERG grapes milk and cheese bought and fruit expensive was
par baki sab theek-thak tha.
but rest all okay was
‘I bought grapes, milk, and cheese today and the fruit was expensive but
the rest was okay.’


I argue in the next section that this contrast follows from Dayal's (2004) ap- 
proach. 


3 Solution: Dayal (2004) 


Dayal's (2004) work is based on Chierchia (1998) and Carlson (1977), who take 
English bare plurals to refer to kinds (as opposed to Wilkinson 1991; Diesing 1992; 
Krifka & Gerstner-Link 1993; Kratzer 1995, who take bare plurals as ambiguous 
between kind terms and indefinites). Chierchia (1998), in particular, attempts to 
derive the typology and distribution of bare nominals across different types of 
languages. Chierchia (1998) focuses on two parameters: (i) presence vs. absence of 
determiners, and (ii) presence vs. absence of number morphology. Dayal (2004) 
modifies Chierchia's (1998) theory, most importantly in the way languages with 
number morphology but without determiners should be analyzed (see §4), but
many core assumptions are adopted from Chierchia (1998). I will provide a brief 
overview of two assumptions of Chierchia’s (1998) system that are most impor- 
tant for the purposes of this paper. The first assumption is that languages may 
employ a number of type-shifting operations, a subset of which is given in (23): 


(23) a. (et = (", 1,4) = (oe Dt 


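b. $\iota$: $\lambda P\,\iota x[P_s(x)]$
c. $^{\cap}$: $\lambda P\,\lambda s\,\iota x[P_s(x)]$
d. $\exists$: $\lambda P\,\lambda Q\,\exists x[P_s(x) \wedge Q_s(x)]$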


(Dayal 2004: 413) 


The main idea is that English bare plurals are derived via a nominalization
operation (‘down’) $^{\cap}$, defined as in (23c) (like other common nouns, they start
life as type $\langle s,\langle e,t\rangle\rangle$). $^{\cap}$ is a function from properties to functions from situations
to the maximal entity that satisfies that property in that situation. The function
is partial in that it requires the kind term to pick out distinct maximal individuals
across situations, thereby capturing the inherently intensional nature of the term.
As shown in (24), this term can be a direct argument of a kind-level predicate:


(24) Dodos are extinct. 


In object-level contexts, however, further operations (see 25a) come into play
to repair the sort mismatch. This repair (DERIVED KIND PREDICATION, DKP; see
Chierchia 1998: 364, Dayal 2004: 399) involves the introduction of existential
quantification over the instantiations of the kind in a given situation. It draws
on the inverse of ∩, the predicativizer or ‘up’ operation ∪ (see 25b), to take kinds
and return their instantiation sets in a given situation:


(25) a. DKP: If P applies to objects and k denotes a kind, then
        P(k) = ∃x[∪k(x) ∧ P(x)]
     b. ∪: λk λx[x ≤ k]
     c. Dogs didn’t bark = ¬bark(∩dogs) = DKP = ¬∃x[∪∩dogs(x) ∧ bark(x)]
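
As a rough illustration of (25), the following sketch applies an object-level predicate to a kind by quantifying over the kind's instances in a situation. It assumes a simplified encoding that is not in the source: a kind is a function from situations to a set of atoms, and ∪ returns only atomic instances. The names `up`, `dkp`, `dogs_kind` and `bark` are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Set

Situation = str
Entity = str
Kind = Callable[[Situation], FrozenSet[Entity]]  # output of the 'down' shift
ObjectPredicate = Dict[Situation, Set[Entity]]


def up(k: Kind, s: Situation) -> Set[Entity]:
    """(25b), simplified: the atomic instances of kind k in situation s."""
    return set(k(s))


def dkp(P: ObjectPredicate, k: Kind, s: Situation) -> bool:
    """(25a): P(k) holds iff some instance of k in s satisfies P."""
    return any(x in P.get(s, set()) for x in up(k, s))


def dogs_kind(s: Situation) -> FrozenSet[Entity]:
    # Hypothetical toy kind: the dogs present in each situation.
    return frozenset({"s1": {"fido", "rex"}, "s2": {"lassie"}}.get(s, set()))


bark: ObjectPredicate = {"s1": {"rex"}, "s2": set()}

# (25c): negation applies to the whole DKP-adjusted predication,
# so the existential introduced by DKP sits inside its scope.
dogs_didnt_bark_s1 = not dkp(bark, dogs_kind, "s1")
dogs_didnt_bark_s2 = not dkp(bark, dogs_kind, "s2")
```

Because the existential is introduced inside `dkp`, there is no way in this sketch to place it above the negation, which mirrors the locality of the adjustment.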


The source of existential quantification over instances of the kind in episodic 
sentences is an automatic, local adjustment triggered by a type mismatch. Bare 
plurals are in many ways different from indefinite singulars (e.g. Carlson 1977), 
for instance in scope: 


(26) a. John didn't read a book.    ¬∃ and ∃¬
     b. John didn't read books.     only: ¬∃
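
The two construals annotated in (26a) come apart in a scenario where John read one book but not another. The sketch below is a toy check with hypothetical names and data (`books`, `read_by_john`); it only illustrates the truth conditions of the two scope orders and makes no claim about how the readings are derived.

```python
from typing import Set


def neg_over_exists(restrictor: Set[str], scope: Set[str]) -> bool:
    """Narrow-scope existential: no individual in the restrictor satisfies the scope."""
    return not any(x in scope for x in restrictor)


def exists_over_neg(restrictor: Set[str], scope: Set[str]) -> bool:
    """Wide-scope existential: some individual in the restrictor fails the scope."""
    return any(x not in scope for x in restrictor)


books = {"b1", "b2"}
read_by_john = {"b1"}  # John read b1 but not b2

narrow = neg_over_exists(books, read_by_john)  # the only reading of (26b)
wide = exists_over_neg(books, read_by_john)    # additionally available in (26a)
```

In this scenario only the wide-scope construal is true, so (26a) has a true reading while (26b) does not.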


The indefinite denotes a generalized quantifier, and it can therefore take wide 
or narrow scope with respect to negation, as shown in (26a). The bare plural, on 
the other hand, is a kind term, which is a direct argument of the predicate (see 
25c). Thus, whenever a kind (in an episodic frame) fills an object-level slot, the 
type of the element in question is automatically adjusted by introducing a local 
existential quantification over instances of the kind. The existential introduced 
by DKP therefore necessarily takes scope below negation. One prediction of this 
system is that non-kind denoting bare plurals should behave like regular existen- 
tially quantified NPs. For instance, they could take different scope with respect to 
negation: this prediction appears to be borne out (Carlson 1977; Chierchia 1998): 


(27) a. * Parts of this machine are widespread. 
b. John didn't see parts of this machine.    ¬∃ and ∃¬


(Dayal 2004: 419) 
Parts of this machine in (27a) is not compatible with true kind predication,
presumably because the definite inside the NP would force the extension of the 
noun phrase to be constant across worlds. But, as shown in (27b), this bare plural 
can now interact with negation, a diagnostic that separates indefinites from kind 
terms. Compare then (27) to (28): 


(28) a. Spots on the floor are a common sight. 
b. John didn't see spots on the floor.    only: ¬∃


In (28), the possibility of kind reference results in the loss of scope interaction. The
bare plural spots on the floor in (28a) is compatible with the kind-level predicate,
which indicates that it has kind reference. As a result, it can only take low
scope in (28b). The system thus explains the contrast between (27) and (28). What
needs to be assumed is that ∩ (see 23c) applies whenever it can; i.e. it
takes precedence over ∃ (see 23d). In (27b) ∩ is unavailable, and therefore ∃
applies, as confirmed by the scope ambiguity. Chierchia (1998) thus ranks ∩ above
∃, arguing that the former is simpler, since it does not introduce quantificational
force (see 29).


(29) Meaning Preservation: ∩ > {ι, ∃} (Dayal 2004: 419)


The immediate question that arises here concerns the availability of ι. In particular,
if ∩ is not available in (27) and ι (see 23b) is an available type-shifting operation,
why can parts of this machine not be interpreted as definite? This brings us
to the second important component of the Chierchia (1998)/Dayal (2004) system,
the BLOCKING PRINCIPLE, which is given in (30):


(30) Blocking Principle (Type Shifting as Last Resort) 
For any type-shifting operation φ and any X: *φ(X) if there is a
determiner D such that for any set X in its domain, D(X) = φ(X). (Dayal
2004: 216) 


The intuition behind this principle is that for considerations of economy lexical
items must be exploited to the fullest before covert type-shifting operations can
be used. So, since English has the, which is the lexical version of ι, it will always
block ι. Thus, in English, bare plurals can avail of ∩ (or ∃ when ∩ is blocked for
independent reasons, as in 27b), but not ι, because of the presence of the lexical
determiner the. This in turn also explains the following contrast between Hindi
(a determiner-less language) and English (Dayal 2004: 417):
(31) a. English
        Some children came in. *(The) children were happy.
     b. Hindi
        Kuch bacce_i aaye. Bacce_i bahut khush lage.
        some children came children very happy seemed
        ‘Some children came. The children seemed very happy.’
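
The Blocking Principle in (30) can be read as a filter on the inventory of covert shifts. The sketch below is a deliberately minimal illustration: it assumes that a language's relevant lexicon can be summarized as a table from shift names to the determiner that lexicalizes them, and that for bare plurals English lexicalizes only ι (as the). The function name `covert_shifts` and the two toy lexicons are hypothetical.

```python
from typing import Dict, List

# The three covert shifts from (23): 'down' (nom), iota, and the existential.
TYPE_SHIFTS = ["nom", "iota", "exists"]


def covert_shifts(lexical_determiners: Dict[str, str]) -> List[str]:
    """Return the shifts that are not blocked by an equivalent lexical determiner."""
    return [shift for shift in TYPE_SHIFTS if shift not in lexical_determiners]


# Assumed toy lexicons for bare plurals, following the discussion of (31).
english: Dict[str, str] = {"iota": "the"}
hindi: Dict[str, str] = {}

english_options = covert_shifts(english)
hindi_options = covert_shifts(hindi)
```

On these assumptions ι survives as a covert option only in the Hindi-type lexicon, which is what allows the bare noun in (31b), but not the one in (31a), to be read as definite.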


While bare nouns in Hindi can be used anaphorically, as shown in (31b), this
is not possible in English (see 31a). This is because there is no lexical definite
determiner in Hindi, which makes ι as well as ∩ available options for bare nominals.
For this reason, bacce ‘children’ in (31b) can be interpreted as definite. In
English, on the other hand, bare plurals can avail of ∩ but not ι. ∩ is a function
whose extension varies from situation to situation, while ι is a constant function
to a contextually anchored entity. Thus, the bare noun children in (31a) cannot
be interpreted as definite/anaphorically. In other words, the underlying assumption
of Chierchia (1998) and Dayal (2004) about ∩ is that it manufactures a kind
out of a property (i.e. an intensional entity) by taking the largest member of its
extension at any given world; it creates a saturated object with concrete, but
possibly spatiotemporally discontinuous manifestations. But ∩ cannot establish
an anaphoric relationship with a contextually anchored entity. Only ι, which selects
the greatest element from the extension of the predicate, can do this. That
is, even though ∩ (nom) is simply an intensional counterpart of ι, “...nom cannot
be used referentially” (Dayal 2011: 1103). In §5 I offer some remarks on how
Dayal’s (2004) typological observations about the relationship between ∩ and ι
relate to Schwarz's (2009; 2013) typology of definiteness marking (i.e. strong vs.
weak definite articles).

Now, since in Dayal (2004) mass kinds are treated on a par with plural kinds,
we have the solution to the puzzle introduced in §2. Recall first that a bare singular
noun in an article-less language like Serbian can be interpreted as definite.
This is expected: ι is allowed, since there is no lexical article to block it. This is
illustrated by (5), repeated below as (32):


(32) Serbian
Juče sam pročitao Zločin i Kaznu - knjiga mi se
yesterday am read Crime and Punishment book-NOM me REFL
zaista svidela.
really liked
‘Yesterday I read Crime and Punishment - I really liked the book.’
However, a bare mass noun in a kind-denoting context cannot be interpreted
as definite in a language like Serbian, as shown in (33) (=13a) below.


(33) Serbian 
Naše mesto već generacijama proizvodi belo grožđe. Sve 
our town already generations produces white grape everything 
dugujemo #(tom) voću. 
owe (that) fruit 
‘Our town has been producing white grapes for generations. We owe 
everything to (that) fruit’ 
— * if voću ‘fruit’ is anteceded by grožđe ‘grapes’
— OK if tom voću ‘that fruit’ is anteceded by grožđe ‘grapes’ 


This is exactly as expected on this approach, since kind-denoting terms must be
derived via ∩; thus, the bare noun voću ‘fruit’ in (33) behaves similarly to the bare
noun children in (31a) with respect to anaphoricity/definiteness. But bare mass
nouns which do not denote kinds can avail of ι in languages like Serbian, because
there is no lexical determiner to block it. Therefore they can be interpreted as
definite, as illustrated in (34) (=18b):


(34) Serbian
Danas sam kupio malo grožđa, hleb i mleko. Voće sam stavio u
today am bought bit grapes bread and milk fruit am put in
frižider a sve ostalo na sto.
fridge and all else on table
‘Today I bought some grapes, bread and milk. I put the fruit in the fridge
and the rest on the table.’
— OK if voće ‘fruit’ is anteceded by grožđe ‘grapes’


Dayal's (2004) approach also makes some interesting predictions about the 
availability of definite interpretations for bare singular and plural (i.e. non-mass) 
kinds in languages without determiners. I discuss these predictions in §4 and
show that they are borne out. 


4 Predictions and consequences 


An important observation about languages with number marking but no deter- 
miners, which is central to Dayal's (2004) modification of Chierchia's (1998) sys- 
tem, is that bare plurals in such languages behave more or less like English bare 
plurals, but bare singulars are substantially different. Although bare singulars
and bare plurals in such languages allow for kind as well as anaphoric readings,
their existential reading is distinct from that of regular indefinites in
two respects: (i) they cannot take wide scope over negation or other operators,
and (ii) they cannot refer non-maximally. Thus, bare NPs cannot be used in translating
(35b) or (35c) to refer to a subset of the children mentioned in (35a) (Dayal
2011: 1100):


(35) a. There were several children in the park.
     b. A child was sitting on the bench and another was standing near him.
     c. Some children were sitting on the bench, and others were standing
        nearby.


So, even though there are no definite or indefinite determiners in these languages,
only readings associated with definites are available to bare NPs. Dayal
argues that this shows that the availability of covert type shifts is constrained, as
proposed by Chierchia (1998), but that the correct ranking is as in (36), not (29)
(note that both ∩ and ι are simpler than ∃):


(36) Revised Meaning Preservation: {∩, ι} > ∃ (Dayal 2004: 219)


This is also motivated by the fact that the Hindi version of (27b) (i.e. 37b) does
not allow a wide scope reading of parts of this machine, even though this bare
plural is not compatible with true kind predication, as shown in (37a).


(37) Hindi
     a. * Is mashiin ke TukRe aam haiN.
          this machine of parts common are
          ‘Parts of this machine are common.’
     b. Anu-ne is mashiin ke TukRe nahiiN dekhe.
        Anu-ERG this machine of parts not see
        ‘Anu didn't see any/the parts of this machine.’
                                        (Dayal 2004: 420)
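
The difference between the rankings in (29) and (36) can be made concrete for a noun phrase like parts of this machine, for which ∩ is unavailable. The sketch below assumes, purely for illustration, that a ranking is a list of tiers and that the grammar picks whatever usable shifts sit in the highest non-empty tier; the names `best_shifts`, `english_usable` and `hindi_usable` are hypothetical.

```python
from typing import List, Set

RANKING_29: List[Set[str]] = [{"nom"}, {"iota", "exists"}]  # 'down' > {iota, exists}
RANKING_36: List[Set[str]] = [{"nom", "iota"}, {"exists"}]  # {'down', iota} > exists


def best_shifts(ranking: List[Set[str]], usable: Set[str]) -> Set[str]:
    """Return the usable shifts in the highest-ranked tier that contains any."""
    for tier in ranking:
        winners = tier & usable
        if winners:
            return winners
    return set()


# 'parts of this machine': the 'down' shift is unavailable (no kind reading).
english_usable = {"exists"}        # iota is additionally blocked by 'the'
hindi_usable = {"iota", "exists"}  # no lexical determiner blocks iota

hindi_under_29 = best_shifts(RANKING_29, hindi_usable)
hindi_under_36 = best_shifts(RANKING_36, hindi_usable)
english_under_36 = best_shifts(RANKING_36, english_usable)
```

Under the original ranking the Hindi-type noun phrase would have both ι and ∃ available, wrongly predicting scope interaction in (37b); under the revised ranking only ι survives, while the English-type noun phrase still ends up with ∃, as in (27b).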


Thus, given the revised ranking in (36), in the absence of ∩, the availability of
ι blocks ∃. What one might take to be the frozen existential reading in (37b) is,
in fact, the (non-familiar) definite reading of a sentence with negation.⁹ Dayal
(2004) also observes that bare singulars are not trivial variants of bare plurals in 
languages like Hindi, and that these languages raise important questions about 
the connection between singular number and kind reference. For example, the 
Hindi example in (38a) has only the implausible reading whereby the same child 
is assumed to be playing everywhere. Its plural counterpart in (38b), however, 
readily allows for a plausible reading: 


(38) Hindi
     a. # CaaroN taraf bacca khel rahaa thaa.
          four ways child was playing
          ‘The (same) child was playing everywhere.’
     b. CaroN taraf bacce khel rahe the.
        four ways children were playing
        ‘Children (different ones) were playing everywhere.’ (Dayal 2004: 406)


In order to explain this contrast, Dayal argues that singular and plural kind 
terms differ in the way they relate to their instantiations, as illustrated by the 
following quote: 


An analogy can be drawn with ordinary sum individuals the players whose
atomic parts are available for predication, and collective nouns or groups
like the team which are closed in this respect: The players live in different
cities vs. *The team lives in different cities (Barker 1992; Schwarzschild 1996).
∩ applies only to plural nouns and yields a kind term that allows semantic
access to its instantiations, analogously to sums. A singular kind term
restricts such access and is analogous to collective nouns. (Dayal 2011: 1100)


⁹ It seems rather clear that bare NPs in languages like Hindi are not true indefinites, but there
are cases for which the most natural translation into English uses an indefinite (Dayal 2011:
1101):

(i) Hindi
Lagtaa hai kamre meN cuhaa hai.
seems be room in mouse be
‘There seems to be a mouse in the room.’

Dayal argues that covert and overt type shifts agree on semantic operations but not on presuppositions.
So, the English article the encodes the operation ι, which Hindi bare NPs use to shift to
type ⟨e⟩ covertly. Both of these variants entail maximality/uniqueness. In addition, the lexical
definite article the has a familiarity requirement that Hindi bare NPs do not. The assumption
is that familiarity presuppositions are attached to lexical items, and that a language that does
not have a lexical definite determiner will not enforce familiarity presuppositions. This non-familiar
maximal reading can then be confused with a true existential reading (see also Heim
2011).



Thus, ∩ is taken to be undefined for singular terms, which makes a prediction
and raises a question. The prediction is that in article-less languages without a
singular-plural distinction (e.g. Mandarin) a sentence like (38a) should be fine.
This is because a language that does not mark number on kind terms should
not impose any constraints on the size accessibility of their instantiation sets,
effectively aligning them with bare plurals. The prediction is borne out:


(39) Mandarin
Gou zai meigeren-de houyuan-li jiao.
dog at everyone-PTCP backyard-inside bark
‘Dogs (different ones) are barking in everyone's backyard.’ (Dayal 2004: 413)


The question is how to characterize singular kind formation. Dayal argues that 
in these cases, the common noun has a taxonomic reading and denotes a set of 
taxonomic kinds. It can then combine with any determiner and yield the relevant 
reading. 


(40) a. Every dinosaur is extinct. 


b. The dinosaur is extinct. 


In (40a), the presupposition that every ranges over a plural domain is satisfied 
if the quantificational domain is the set of sub-kinds of dinosaurs. The uniqueness
requirement of the with a singular noun in (40b) is satisfied if the quantificational 
domain is the set of sub-kinds of animals. There is, therefore, nothing special 
about the definite article in definite singular kinds like (41), according to Dayal. 
The definite singular generic is derived compositionally from the regular definite 
determiner plus a common noun under its taxonomic guise: 


(41) The lion comes in several varieties, the African lion, the Asian lion ...


Specifically, in the case of kind formation out of singular nouns, there is a 
clash between singular morphology and plurality associated with kinds, which 
is repaired as in (42), where X ranges over entities in the taxonomic domain. (42) 
then forces the application of ι, which in English is lexicalized as the.


(42) PredK(∩lion) = *∩(SING) = PredK(ιX[LION(X)]) (Dayal 2004: 435)
At the same time, mass kinds must be bare in English (43), which is expected
given that ∩ is defined for them. Mass kinds thus behave like plural kinds.


(43) (*The) wine comes in several varieties, (*the) red wine, (*the) white wine and
(*the) rosé.


We expect then that plural kinds and singular kinds in English should differ in
their ability to be interpreted as definite, i.e. only the latter could be interpreted
anaphorically. This is because in the case of singular kinds ∩ cannot apply (it
clashes with the singular number morphology), and the (the lexical realization of ι in
English) is introduced via (42). This appears to be true, as the contrast between
(44) and (45) illustrates. The definite singular the bird can be anteceded by the
dodo in (45), while establishing an anaphoric relationship between the bare plurals
birds and dodos in (44) does not seem to be possible.


(44) Only dodos and gorillas survived on the continent. 
After the humans arrived birds were wiped out. 
— ?* if birds is anteceded by dodos 


(45) Only the dodo and the gorilla survived on the continent. 
After the humans arrived the bird was wiped out. 
— OK if the bird is anteceded by the dodo 


Crucially, the same kind of contrast should in principle appear in article-less
languages with number morphology. ∩ should not be defined for singular terms,
and ι should be available for them via (42); thus, the definite/anaphoric interpretation
should be available for singular kinds in languages without articles. However,
since ∩ is defined for plural kinds, they should pattern with mass kinds
in terms of the availability of definite interpretation; i.e. they should lack the
anaphoric interpretation. I believe that the following contrasts from Serbian and
Turkish are clear enough to confirm this prediction. For example, the Serbian examples
in (46) and (47) differ only in terms of number. However, there is a noticeable
contrast between them in the availability of anaphoric interpretation, similar to
(44-45). The Turkish examples in (48-51) illustrate the same point.¹¹


¹⁰ As indicated in the translation of (47), the object here can be modified with the expression
‘as a kind’, which shows that what we are dealing with here is not an object-level but a kind-level
expression. This is true for previous examples involving kind reference as well. Also, the
object in (46) can be replaced with ‘the kind of bird known as bald eagle’ (e.g. My whole life, I
have been studying the kind of bird known as bald eagle). The same can be done with the other
relevant examples. Moreover, one can dedicate one's entire career to studying the work of Abraham
Lincoln, and use (i.a) to express that, but ‘as a kind’ cannot modify the object in this particular
case; e.g. (i.b) is clearly more marked than (i.c). This follows from the fact that something that
is necessarily instantiated by just one individual (Abraham Lincoln) does not qualify as a kind.
All of this shows that these examples truly involve kind reference.


(i) a. I have been studying Abraham Lincoln my whole life.
    b. # I have been studying Abraham Lincoln, as a kind, my whole life.
    c. I have been studying the bald eagle, as a kind, my whole life.


¹¹ Recall that due to the Blocking Principle, ι is never available for bare nouns in English, singular
or plural (the existence of the definite article blocks it); for this reason, bare nouns can never
be interpreted anaphorically in English. On the other hand, ι is in principle available to both
singular and plural bare nouns in languages like Serbian and Turkish. In the case of bare plurals,
both ∩ and ι are available depending on whether the noun in question has a kind or object-level
interpretation, respectively. In such languages, the context and the type of predicate could play
a crucial role: a kind-selecting predicate (rare, widespread, extinct...) could, for instance, make
the contrast clearer for some speakers; compare (i-ii) with (46-47) respectively. In general, it
is not unexpected that this contrast would be somewhat subtler in languages like Serbian or
Turkish than in English.


(i) Serbian
Ceo život proučavam beloglavog orla - na žalost, pre deset godina ptica
whole life study-1.PRS white-headed eagle unfortunately before ten years bird
je istrebljena.
is exterminated
‘I have been studying the bald eagle my whole life. Unfortunately, ten years ago the
bird was exterminated.’
— OK if ptica ‘bird’ is anteceded by beloglavog orla ‘bald eagle’


(ii) Serbian
Ceo život proučavam beloglave orlove - na žalost, pre deset godina ptice
whole life study-1.PRS white-headed eagles unfortunately before ten years birds
su istrebljene.
are exterminated
‘I have been studying bald eagles my whole life. Unfortunately, ten years ago birds
were exterminated.’
— ?* if ptice ‘birds’ is anteceded by beloglave orlove ‘bald eagles’
(46) Serbian (singular)
Ceo život proučavam beloglavog orla - ptica je fantastična.
whole life study-PRS white-headed eagle bird is fantastic
‘I have been studying the bald eagle (as a kind) my whole life. The bird is
fantastic.’
— OK if ptica ‘bird’ is anteceded by beloglavog orla ‘bald eagle’


(47) Serbian (plural)
Ceo život proučavam beloglave orlove - ptice su fantastične.
whole life study-PRS white-headed eagles birds are fantastic
‘I have been studying bald eagles (as a kind) my whole life. Birds are
fantastic.’
— ?* if ptice ‘birds’ is anteceded by beloglave orlove ‘bald eagles’


(48) Turkish (singular)
Kel kartal, Kuzey Amerika’da bulunur. Güç ve hız-ın
bald eagle North America-LOC is found strength and speed-GEN
sembolü olarak tanınır. Ancak, küresel ısınma nedeniyle, kuş
symbol as recognized however global warming because bird
yakında tamamen yok olabilir.
soon completely may disappear
‘The bald eagle is found in North America. It is the symbol of strength
and speed. However, because of global warming, the bird may soon
completely disappear.’
— OK? if kuş ‘bird’ is anteceded by kel kartal ‘bald eagle’


(49) Turkish (plural)
Kel kartallar, Kuzey Amerika’da bulunurlar. Güç ve hız-ın
bald eagles North America-LOC are found strength and speed-GEN
sembolü olarak tanınırlar. Ancak, küresel ısınma nedeniyle, kuşlar
symbol as recognized however global warming because birds
yakında tamamen yok olabilir.
soon completely may disappear
‘Bald eagles are found in North America. They are the symbol of strength
and speed. However, because of global warming, birds may soon
completely disappear.’
— * if kuşlar ‘birds’ is anteceded by kel kartallar ‘bald eagles’

(50) Turkish (singular)
Kel kartal, Kuzey Amerika’da bulunur. Güç ve hız-ın sembolü
bald eagle North America-LOC is found strength and speed-GEN symbol
olarak tanınır. Ayrıca, kuşun gözleri oldukça keskindir.
as recognized also bird-GEN eyes quite sharp
‘The bald eagle is found in North America. It is the symbol of strength
and speed. Also, the bird’s eyes are quite sharp.’
— OK if kuş ‘bird’ is anteceded by kel kartal ‘bald eagle’


(51) Turkish (plural)
Kel kartallar, Kuzey Amerika’da bulunurlar. Güç ve hız-ın
bald eagles North America-LOC are found strength and speed-GEN
sembolü olarak tanınırlar. Ayrıca, kuşların gözleri oldukça keskindir.
symbol as recognized also birds-GEN eyes quite sharp
‘Bald eagles are found in North America. They are the symbol of strength
and speed. Also, birds’ eyes are quite sharp.’
— * if kuşlar ‘birds’ is anteceded by kel kartallar ‘bald eagles’


Finally, bare non-mass kinds in article-less languages without number mor- 
phology (e.g. Mandarin, Japanese) are expected not to have definite/anaphoric 
interpretations. ∩ is defined for such nouns, since these languages do not have
singular morphology that would clash with plurality associated with kind forma- 
tion (recall also 39; see Dayal 2004: 411-413). In terms of anaphoricity/definiteness, 
bare non-mass kinds in these languages should pattern with plural kinds (and 
mass kinds) in languages like Serbian and Turkish. This also appears to be borne 
out, as shown in (52) and (53). The non-mass noun tori ‘bird’ in (52) cannot 
be anteceded by hagetaka ‘bald eagle’, in contrast to (46-48). As already men- 
tioned in footnote 8, Jenks (to appear) shows that Mandarin makes a systematic 
distinction between unique and anaphoric definites (e.g. Schwarz 2009); while 
unique definites are realized as bare nouns, anaphoric definites are realized with 
a demonstrative, except in subject positions, where bare nouns can also be inter- 
preted anaphorically. Examples in (20) which involve object-level interpretation 
are consistent with Jenks' observations in that bare nouns in subject positions 
can be used anaphorically. Bare nouns in (14) and (53), on the other hand, lack 
anaphoric readings precisely because they are derived by ∩, which is responsible
for the kind-level interpretation. 
(52) Japanese
Watashi-wa nagai aida hagetaka-o kenkyu shitekita. Tori-wa
I-TOP long time bald eagle-ACC studied bird-TOP
subarashi.
fantastic
‘I have been studying the bald eagle for a long time. The bird is fantastic.’
— * if tori ‘bird’ is anteceded by hagetaka ‘bald eagle’


(53) Mandarin
Zhiyou gezi he daxingxing xingcun zai zhe pian dalu shang.
only pigeon and gorilla survive LOC this CLF continent on
Danshi hen kuai niao jiu miejue le.
but very quickly bird PTCP extinct ASP
‘Only the pigeon and the gorilla survived on the continent. But very
quickly the bird went extinct.’
— * if niao ‘bird’ is anteceded by gezi ‘pigeon’

5 Summary and further questions 


The initial contrast in interpretation between mass kinds in English and lan- 
guages without definite articles led us to an analysis from which some rather 
systematic patterns appear to emerge. 


Table 1: Languages without definite articles: Bare nouns

             +Number                                    −Number
             Kind-level            Object-level         Kind-level      Object-level
             Mass   Count  Count   Mass   Count         Mass   Count    Mass   Count
                    (PL)   (SG)
Anaphoric    ✗      ✗      ✓       ✓      ✓             ✗      ✗        ✓      ✓
Type-shift   ∩      ∩      ι†      ι      ι             ∩      ∩        ι      ι

† ∩ undefined for singular nouns; ι applies to the taxonomic domain


As Table 1 above shows, the availability of anaphoric/definite readings of bare
nominals in languages without definite articles correlates with the availability
of ∩ and ι. More specifically, whenever ∩ applies, the anaphoric/definite reading
is missing. We see that object-level and kind-level readings are available both in
languages with number marking (e.g. Serbian) and in languages without number
marking (e.g. Japanese). ι is responsible for the anaphoric interpretation of object-level
bare nouns in both types of languages. Where the two language types differ
is in how they manufacture kinds. In languages without number marking, all kinds
are created via ∩, which means that bare kind-level nouns in these languages
cannot be interpreted anaphorically. In other words, since count nouns in these
languages do not mark number (and are used with classifiers etc.), they pattern
with mass nouns and are accessible to ∩. But in languages with number marking,
kind-level singular count bare nouns cannot be formed via ∩, due to a clash with
singular number morphology. This is repaired by (42), which introduces ι. As a
result, only this type of bare kind-level noun will have anaphoric potential. For
bare mass and plural nouns, both ι and ∩ are available, given the modified ranking
of operations in (36), according to which they are both more highly ranked than
∃. Which one of them applies will depend on the context (among other things).
In contexts like (31b), ι applies and creates the anaphoric reading. But if a kind-level
interpretation of the antecedent noun is forced by the context (as in 33),
the anaphoric relation will be missing; ι maps property extensions to individuals,
and a kind is identified with the totality of its instances in any given world
(or situation). If, on the other hand, ∩ applies, the anaphoric relation will still
be absent, since ∩ is a function whose extension varies from world/situation to
world/situation (while ι is a constant function to a contextually anchored individual).
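
The generalization summarized in Table 1 can be restated as a small decision procedure. The sketch below is only a compact paraphrase of the pattern described in the text for article-less languages, with hypothetical function and parameter names (`type_shift`, `anaphoric`, `has_number`, `level`, `noun`, `number`); it treats the kind-level versus object-level interpretation as given by the context rather than deriving it.

```python
from typing import Optional


def type_shift(has_number: bool, level: str, noun: str,
               number: Optional[str] = None) -> str:
    """Predict which covert shift interprets a bare noun in an article-less language.

    level:  "kind" or "object"
    noun:   "mass" or "count"
    number: "sg" or "pl"; only relevant for count nouns when has_number is True
    """
    if level == "object":
        return "iota"
    if has_number and noun == "count" and number == "sg":
        # The 'down' shift is undefined for singulars; the repair in (42)
        # brings in iota over the taxonomic domain.
        return "iota"
    return "nom"


def anaphoric(has_number: bool, level: str, noun: str,
              number: Optional[str] = None) -> bool:
    """A bare noun has anaphoric potential only when iota, not 'down', applies."""
    return type_shift(has_number, level, noun, number) == "iota"
```

For instance, `anaphoric(True, "kind", "count", "sg")` corresponds to the Serbian singular kind term in (46), while `anaphoric(False, "kind", "count")` corresponds to the Japanese and Mandarin kind terms in (52) and (53).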

Now, as already noted, ∩ is the intensional counterpart of ι, and Dayal (2004)
takes the latter to be the canonical meaning of the definite determiner. One of the
significant cross-linguistic patterns discussed in Dayal (2004) is the absence of
dedicated kind determiners in natural language. That is, plural kind terms are
either bare (e.g. English, Hindi), or definite (e.g. Italian, Spanish). A simple explanation
for this robust generalization is that ∩ is the intensional counterpart of
ι and that languages do not lexically mark extensional/intensional distinctions.
There are additional systematic restrictions: for example, if a language uses bare
nominals for anaphoric readings, then it also uses them as plural kind terms. Also,
if a language uses definites as plural kind terms, it also uses them for anaphoric
readings. Thus, correlations are not completely arbitrary; e.g. there are no attested
languages in which bare plurals could be used anaphorically and at the
same time definite plurals could refer to kinds. To account for these facts, Dayal
proposes a universal principle of lexicalization in which ι (which is canonically
used for anaphoric reference) and ∩ (which is canonically used for generic reference)
are mapped along a scale of diminishing identifiability: ι > ∩. Languages
can then lexicalize at distinct points on this scale, proceeding from ι to ∩. Languages
without determiners like Serbian use the extreme left as the cut-off for
lexicalization; in such languages both ι and ∩ are covert type shifts. The cut-off
point for mixed languages like English is in the middle; here ι is lexicalized (the)
and ∩ is a covert type shift. ι and ∩ are both encoded lexically in obligatory determiner
languages like Italian, where the cut-off point is at the extreme right. So if
a language has a lexical determiner for plural kind formation, this automatically
means that its cut-off point is at the extreme right. The principle of lexicalization
above therefore entails that such a language could not have a covert ι. The unattested
language type mentioned above would then not conform to the proposed
direction of lexicalization.¹²
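
The lexicalization principle amounts to the requirement that the set of lexicalized operations be an initial segment of the scale. The sketch below illustrates that reading with hypothetical names (`SCALE`, `conforms`) and with each language reduced to the set of operations it is assumed to lexicalize; it is a simplification for illustration, not a formalization found in Dayal (2004).

```python
from typing import Set

# Scale of diminishing identifiability: iota comes before 'down' (nom).
SCALE = ["iota", "nom"]


def conforms(lexicalized: Set[str]) -> bool:
    """True iff the lexicalized operations form an initial segment of SCALE."""
    cutoff = len(lexicalized)
    return set(SCALE[:cutoff]) == lexicalized


# Assumed summaries of the language types discussed in the text.
serbian: Set[str] = set()            # cut-off at the extreme left
english: Set[str] = {"iota"}         # cut-off in the middle
italian: Set[str] = {"iota", "nom"}  # cut-off at the extreme right
unattested: Set[str] = {"nom"}       # lexical kind determiner, covert iota
```

Only the last configuration fails the check, which corresponds to the unattested type in which definite plurals refer to kinds while bare plurals are used anaphorically.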

We can also view the relationship between ı and ^ from the perspective of 
Schwarz's (2009) account of strong/weak definites. Schwarz discusses a distinc- 
tion between strong and weak definite articles in German: strong articles are used 
in familiar definite environments and are anaphoric to a previously introduced 
referent, while weak articles occur in unique definite contexts. Schwarz proposes 
that strong (anaphoric) definites take an index as an argument, while unique def- 
inites do not (see also Jenks to appear). That is, anaphoric articles are more com- 
plex than their unique counterparts since they take one extra argument. At the 
same time, both types of articles presuppose the existence of a unique individ- 
ual. Jenks (to appear) shows that different languages lexicalize/mark these two 
types of definites differently. Languages like German and Lakhota (see Schwarz 
2013) have two separate lexical items/markers to encode unique definites (i.e. ı)
or anaphoric definites (i.e. ıˣ). There are also languages like Fante Akan and
Mandarin (see footnote 8) which have a lexical definite marker for anaphoric definite
environments (i.e. ıˣ), but no marker for unique definite contexts (a covert type
shift is used instead). Finally, there are languages like English that use a single
lexical item for both types of definites. We could add to this list languages like
Serbian, which can use covert type shifts for both environments. But if Schwarz and
Jenks are right in making a distinction between the unique ı and the anaphoric
ıˣ (which I believe they are), then the facts discussed here strongly suggest that
^ is the intensional counterpart of the unique ı and not of the anaphoric ıˣ. This
is further supported by the fact that in German it is the weak (unique definite)
article that is used for kind reference (e.g. Schwarz 2009: 65-66). That is, if
languages do not lexically mark extensional/intensional distinctions and if ^ is the
intensional counterpart of the unique ı, then it follows that in languages which
use two separate markers for unique and anaphoric definites, the unique definite
marker will also be used for kind reference.

¹²Languages like Brazilian Portuguese and German are particularly interesting because they
allow a certain degree of optionality. Brazilian Portuguese admits bare singulars, while some
dialects of German allow both bare and definite plurals/mass terms for kind reference, but the
variation in available meanings is still quite limited. For detailed discussion of these languages
see Dayal (2004; 2011), Krifka (1995), Müller (2002), Munn & Schmitt (2005), Cyrino & Espinal
(2015) and references therein.
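The typology just described, together with the prediction about kind reference, can be laid out as a small lookup table. The following is a minimal sketch only: the dictionary `TYPOLOGY`, the marker labels, and the helper `predicted_kind_marker` are hypothetical names introduced for illustration and are not part of Schwarz's or Jenks's proposals. `None` is used here to stand for a covert type shift.

```python
# Toy encoding of the typology described in the text. Each language maps to a
# pair (unique_marker, anaphoric_marker); None stands for a covert type shift.
# The marker labels are placeholders chosen for this sketch.
TYPOLOGY = {
    "German": ("weak article", "strong article"),
    "Lakhota": ("unique marker", "anaphoric marker"),
    "Fante Akan": (None, "anaphoric marker"),
    "Mandarin": (None, "anaphoric marker"),
    "English": ("the", "the"),
    "Serbian": (None, None),
}


def predicted_kind_marker(language):
    """Return the marker predicted to express kind reference, or None.

    The prediction in the text only covers languages with two distinct overt
    markers: there the unique-definite marker should also mark kinds.
    For every other language type this sketch makes no prediction.
    """
    unique, anaphoric = TYPOLOGY[language]
    if unique is not None and anaphoric is not None and unique != anaphoric:
        return unique
    return None


print(predicted_kind_marker("German"))   # weak article
print(predicted_kind_marker("English"))  # None
```

The function deliberately returns `None` for English, Mandarin and Serbian, since the claim in the text is restricted to languages that lexicalize the two definites separately.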

I have to leave some questions for future work, since they are outside of the 
scope of this study. For example, I showed that if a demonstrative is added to 
the constructions with kind-level context, the anaphoric reading becomes pos- 
sible. The question is, of course, how this should be formalized. At this point 
I have to assume that this is due to some specific property of this lexical
element.¹³ For instance, Chierchia (1998: 353) proposes (for independent reasons)
that determiners may semantically come in two variants: those that apply to
predicates and those that apply to kinds. One possibility is that a demonstrative
like Serbian to ‘that’ has both types of interpretations and can therefore combine
with kinds.¹⁴ Another question which should be more directly investigated
is what kind of discourse factors facilitate or inhibit the anaphoric reading of 
bare nouns and how they can be distinguished from those discussed in this paper.
It is clear that, in terms of anaphoricity, ı (i.e. a bare noun) is less potent than
demonstratives and pronouns (see Footnote 13). The question is then whether this
contrast can ultimately be reduced to some version of a blocking (elsewhere)
condition that governs the distribution of covert and overt elements (e.g. use overt
demonstratives/pronouns wherever you can and avoid the covert ı), or whether
the anaphoric potential of ı is truly impoverished compared to that of
demonstratives/pronouns.

¹³Similar questions can be raised with respect to kind-referring pronouns that can be anteceded
by non-kind NPs. In (i) below, for example, the antecedent Martians refers to some Martians,
while themselves refers to the kind (see Rooth 1985 and Krifka 2003 for details). So the next step
would be to check whether constructions like (i) are allowed in the languages discussed here (in
particular, whether both coreference and anaphoric binding are possible) and then what
implications such facts would have for the analysis presented here. I have to leave this for
future work.

(i) At the meeting, Martians presented themselves as almost extinct.

¹⁴This line of reasoning would be supported by a language which makes some kind of
morphological distinction between the two determiner variants. This seems to be true for Serbian (and
some other Slavic languages), at least to a first approximation: in addition to taj ‘that’, which
seems to be ambiguous as noted above, there are also determiners like takav which are best
translated as ‘that kind’ (also kakav ‘what kind’, onakav ‘that kind’, etc.). This, however,
requires a more careful examination, which I leave for future work.

¹⁵It needs to be clarified that the presence of demonstratives does not necessarily indicate the
presence of DP (or some other functional projection) in languages without articles. For example,
as discussed in Bošković (2005), Despić (2011; 2013), Zlatić (1997), etc., it is much more
plausible to analyze demonstratives (and possessives) in Serbian as NP-adjuncts. A number
of morpho-syntactic arguments support this claim: the availability of LBE, the appearance of
Serbian possessives and demonstratives in adjectival positions (and adjective-like agreement),
stacking up, impossibility of modification, specificity effects, etc. This is based on syntactic
evidence, and as long as the demonstrative is assigned appropriate meaning, semantic
composition is not affected.

Overall I hope to have shown that the general pattern of cross-linguistic vari- 
ation given in Table 1 follows from Dayal’s (2004) approach, which is based on 
a limited set of type-shifting operations constrained by the Blocking Principle, 
and which incorporates an appropriate analysis of number morphology.


1     first person
ACC   accusative
CLF   classifier
DKP   Derived Kind Predication
ERG   ergative
GEN   genitive
LBE   Left Branch Extraction
LOC   locative
PRS   present
PST   past
PTCP  particle
REFL  reflexive
TOP   topic
UDP   Universal DP (Approach)


In the literature on generic nominal reference, it is usually pointed out that in
Russian, both singular and plural nominal expressions can have a generic reference 
(Chierchia 1998; Doron 2003; Dayal 2004). The main contribution of this article is 
to propose an explicit analysis for composing definite kinds from bare nominals 
in this language. We provide independent empirical support for the definiteness 
of apparent bare nominals in argument position of kind-level predicates and argue 
that definiteness is to be associated with a null D(eterminer), interpreted as the 
iota operator. The general hypothesis we defend is that definite kinds, even in a 
language without articles such as Russian, encode definiteness semantically and 
syntactically. 


1 Introduction 


In the literature on generic nominal reference it is usually pointed out that in 
Russian, a language without articles, both bare singular and bare plural nomi- 
nal expressions can have a generic reference (Chierchia 1998; Doron 2003; Dayal 
2004). This is exemplified in (1), where nouns specified morphologically for singular
(1a) and for plural (1b) occur in argument position of a k(ind)-level predicate.¹
In this context both panda and pandy can be said to refer to kinds.


(1) a. Panda naxoditsja na grani isčeznovenija.
       panda.NOM.SG is.found on verge extinction.GEN

    b. Pandy naxodjatsja na grani isčeznovenija.
       panda.NOM.PL are.found on verge extinction.GEN


A common background assumption considers plural generics as more natural
and preferable, so in a significant part of the literature on genericity it is taken for
granted that plurals (bare plurals in English) constitute the "default" way to refer
to kinds.² Setting aside the question of what is the "default" way to express
genericity in the nominal domain in Russian, we simply point out that, given
that (1a) is grammatical and natural, an analysis of it is needed in the theory of
grammar in any case.

In contrast to Russian, in a language with overt determiners, English for instance,
the subject of a sentence corresponding to (1a) will be expressed by means
of the the N construction (i.e. the panda), known as the definite generic (Carlson 1977)
or the singular generic (Chierchia 1998), as in (2a). On the other hand, English also
allows bare plurals to refer to kinds, as illustrated in (2b).


(2) a. The panda is on the verge of extinction.

    b. Pandas are on the verge of extinction.


The correspondence between the so-called English definite generic and the
Russian bare nominal with a kind reference interpretation in (1a) is usually
assumed to hold merely on the basis of their singular number morphology (cf.
Dayal 2004), so a reasonable expectation is that the analysis assumed for definite
generics in English can also be extended to the corresponding Russian cases.
This approach has to address at least the following issue. Any analysis of the
English definite generic includes the iota operator (ı) in the semantic representation
(cf. Chierchia 1998, Dayal 2004), which is quite indisputable for English,
given that these expressions appear with a definite article.³

¹In this paper, we assume a three-way classification of verbal predicates into k(ind)-level,
i(ndividual)-level and s(tage)-level (Carlson 1977). While k-level predicates appear to form a
scarce but stable class, it is well known that the division line between i- and s-level predicates
is not clearly marked. For instance, fly in (i.a) denotes an i-level property while in (i.b) it
functions as an s-level predicate:

(i) a. Hummingbirds fly backwards.
    b. Hummingbirds are flying over the lake.

²See Ionin et al. (2011) for an experimental investigation on the expression of genericity in
English, Spanish and Brazilian Portuguese.

More generally, a number of questions arise with respect to (2) if we take into 
account some cross-linguistic data. In Spanish, for instance, bare plurals do not 
have a generic reading (Laca 1990; Dobrovie-Sorin & Laca 1996; 2003), making 
them different from bare plurals in English (e.g. 2b), which are considered to 
be the genuine expression of kind reference in that language (Longobardi 1994; 
2001; 2005; Chierchia 1998; Dayal 2004, i.a.). By contrast, the default way to refer 
to kinds in Spanish is by means of a (non-plural) common noun preceded by a 
definite article (Borik & Espinal 2015). The question is then how to derive a kind 
reference for languages like Spanish and English keeping in mind these crucial 
differences concerning the interpretation of bare plurals. A look at languages like 
Russian makes the issue even more complex: Russian does not have any articles
but clearly possesses the means to make reference to kinds, as shown in (1). Does
this mean that the same type of analysis as for English and Spanish could or
should be extended to Russian despite the observed superficial differences in the
syntax of nominal phrases?⁴

This paper aims at contributing to an understanding of kind expressions of 
the type exemplified in (1a). We provide independent empirical support for the 
definiteness of the subject in (1a), and argue that it is to be associated with a 
null D(eterminer), interpreted as ı. We postulate the structure in (3a) for definite
kind arguments in languages with and without articles (e.g. Germanic, Romance,
Slavic), the meaning of which is represented in (3b).


(3) a. [DP D [NP N]]
    b. ⟦Def N⟧ = ıxᵏ[P(xᵏ)]
       where P corresponds to the descriptive content of a noun N, and
       xᵏ ∈ K (i.e. the domain of kinds)


³Although see Coppock & Beaver (2015), who argue that definiteness as encoded by the
definite article must be distinguished from determinacy, which consists in denoting an individual.
Should this claim also be adopted for Russian, it would need an independent motivation, since
Russian does not overtly express definiteness.

⁴See also Cyrino & Espinal (2015) for an analysis of definite kinds and definite plural generics
within the NP/DP debate in Brazilian Portuguese, a language that allows the omission of the
article in all argument positions.

Although we do not deal with plural kind expressions exemplified in (1b) in
this paper, we would like to point out that they do not constitute a counterexample
to our analysis for (1a). We assume that a different syntactic and semantic
composition is to be associated with the generic (bare) plural in (1b). In particular,
the analysis proposed in Chierchia (1998), in which plural kind nominals
are semantically derived by the down operator that applies to plural properties,
could be adopted to account for plural generics in Russian. Our hypothesis
(which we will not defend or justify further in this paper) with respect to plural
kind nominals in Russian is, therefore, that these expressions are, indeed, derived
from pluralities and are specified for Number, namely, for plural. Their structural
representation would then look as in (4).


(4) [NumP Num[+PL] [NP N]]


The differences between (3a), the structure that we adopt for definite kinds, 
and (4), the structure that we would hypothesize for generic plurals, are obvious. 
First of all, definite kinds are syntactically and semantically definite and hence 
are structurally represented as full DPs, whereas there is no a priori evidence to
suggest that the same holds for generic plurals.⁵ Secondly, Number is present only
in the structure for generic plurals. We will not deal specifically with the
syntax and semantics for Number in this paper, but in general, we assume that 
definite kinds are syntactically and semantically numberless, at least in those lan- 
guages where nominals inflect for number (see Borik & Espinal 2015 for details). 

The paper is organized as follows. §2 presents the theoretical framework that 
constitutes the basis for our analysis. We will introduce the fundamental theo- 
retical claims regarding the composition of definite kinds, focusing, in turn, on 
the meaning of Ns (properties of kinds) and the meaning of the definite article 
(ı). In §3 we will present our analysis of definite kinds in Russian. With this aim
in mind we will provide both semantic arguments for definiteness and syntactic
arguments for a DP structure with a null D (translated as ı). This section will
close with an account of modified definite kinds. §4 will conclude the paper.


⁵This matter, however, deserves a full and thorough investigation, which falls outside the scope
of this paper. 

⁶We differentiate between morphophonological number, on the one hand, and syntactic Number,
which is always interpreted semantically, on the other. In Russian, any nominal expression
is marked for number and case and these two specifications come as a cluster. In other words,
it is impossible to determine which part of a cluster encodes number and which part encodes
case, which is a standard feature of a language with synthetic morphology. We assume that this
cluster does not necessarily correspond to a syntactic Number projection, which has to have a
semantic effect, and yield either a singular or a plural interpretation for a nominal phrase (cf.
Ionin & Matushansky 2006; Pereltsvaig 2013 for similar claims).


2 Theoretical background


In this section we will briefly summarize the theoretical assumptions or postu- 
lates underlying our account of definite kinds in natural languages. 

We assume that definite kinds express D-genericity (cf. Krifka et al. 1995) and 
argue that they are composed by applying ı, which is encoded by the definite arti- 
cle, to the denotation of a common noun, which denotes properties of kinds. This 
proposal is conceived as a universal principle, no matter whether the languages 
considered have overt articles (such as English) or not (such as Russian). 

We start this section by discussing the meaning of common nouns. We argue 
that they denote properties of kinds (Espinal & McNally 2007a,b; Dobrovie-Sorin 
& Pires de Oliveira 2008; Espinal 2010; Espinal & McNally 2011). Next, we discuss 
the meaning of the definite article, conceived as a maximality operator (Sharvy 
1980), and the composition of a definite kind reading. 


2.1 Theoretical postulate 1: Root common nouns denote properties of 
kinds 


Kind reference in natural language is quite often assumed to be a special type of 
reference contrasted with the reference to objects. In other words, if objects are 
standard entities of the semantic ontology, so are kinds. This theoretical hypothe- 
sis can be traced back to at least Carlson (1977), who distinguished between three 
types of entities relevant for natural language semantics: kinds, that is, the deno- 
tation of the panda and pandas in (2); objects, that is, the denotation of proper 
names and common noun phrases; and stages, i.e. the denotation of the last type 
of nominal expressions in combination with stage-level predicates. Kinds and ob- 
jects, in Carlson's typology, are abstract entities and together they form a class 
of "individuals", whereas stages are concrete spatio-temporal realizations of ab- 
stract entities. 

In less fine-grained classifications of entities, only two types are recognized: 
kinds and objects (cf. Zamparelli 1995).⁷ This is the ontology assumed here as
well: we distinguish between kinds, or abstract entities, and objects, or particular 
entities, although we do not agree with Carlson (1977), Zamparelli (1995), and 
many others after them, for whom the denotation of a common noun is a kind 
entity.


⁷In a different terminological tradition (e.g. Vergnaud & Zubizarreta 1992) this distinction
corresponds to types vs. tokens.

Under a different approach it is claimed in the semantic literature that common
nouns denote properties, rather than entities (Chierchia 1984; 1998; Partee 1986 
among many others), that is, common nouns are lexical predicates. 

In this paper, we adopt a third alternative and postulate that common nouns 
denote properties of kinds.⁸ This alternative has been empirically motivated in
a number of recent proposals, including Dobrovie-Sorin & Pires de Oliveira's 
(2008) work on bare nouns in Brazilian Portuguese, McNally & Boleda's (2004) 
analysis of relational adjectives, and Espinal's (2010) and Espinal & McNally's 
(2007b; 2011) semantic description of the meaning of bare nouns in object posi- 
tion in Catalan and Spanish. The arguments supporting the hypothesis that com- 
mon nouns denote descriptions of kinds are based on pronominalization, number 
neutral interpretation and adjective modification. The reasoning is the following: 


(i) A common noun (a real bare nominal) cannot be taken to refer to individ- 
ual object-entities because the anaphoric pronoun that it licenses (in some 
Romance languages) is not compatible with an object/token interpretation 
(cf. the difference between Catalan en, lit. ‘one’, referring to properties, and
el/la/els/les, lit. 3RD.ACC.SG/PL.MASC/FEM ‘it/them’); if it cannot denote an
entity, it must denote a property.


(ii) If a common noun has a property denotation, it has no inherent number 
information, and therefore it has a number neutral interpretation (i.e. it 
is compatible with atomicity and non-atomicity entailments, Farkas & de 
Swart 2003); by contrast, nouns specified syntactically for Number refer 
either to atomic or non-atomic sums. 


(iii) If a common noun had an individual property denotation, it would be ex- 
pected to easily combine with any kind of modifier, but this is not the 
case. Bare nouns in syntactic positions that allow bare nominals (e.g. in 
object position of a restricted class of predicates (Espinal & McNally 2007b; 
2011) and in predicate position of copular sentences (de Swart et al. 2007; 
Zamparelli 2008)) can only combine with classifying adjectives, and this 
restriction can be explained only if both expressions are taken to denote 
properties of kinds or if the appropriate adjectives are kind modifiers. 


We thus conclude that it is highly plausible to assume the denotation of a 
common noun to be a property of a kind.⁹


⁸We adopt this hypothesis for all types of nouns, i.e. count, mass and abstract nouns.

⁹This view should be contrasted with those in which the interpretation of a nominal root is
equivalent to that of a mass noun (Borer 2005; Rothstein 2010), and with those that derive
taxonomic kinds in the lexicon by a direct application of the MASS operation to an N_root (Pires
de Oliveira & Rothstein 2011; Trugman 2013).

Now, what precisely does it mean to say that common nouns denote properties
of kinds? We assume that there are two domains in our semantic ontology, the 
domain of objects and the domain of kinds. Under a standard view, the denotation 
of the predicate with the descriptive content P is the set of objects that share 
property P. Thus, the denotation of the noun boy in the domain of objects is 
a set of objects that have the boy-property. Note, however, that in our world 
some nouns can denote singleton sets (e.g. sun or moon). Without challenging 
the process described above, we propose that instead of the domain of objects, 
common nouns range over kinds, conceived as integral entities. Thus, the same 
noun boy in our proposal looks for entities that share a boy-property but in the 
domain of kinds rather than objects. 

In accordance with what we have just said the meaning of a common noun 
should have the logical representation in (5), where P stands for a property
corresponding to the descriptive content of N, and xᵏ a kind entity, such that the
property P applies to xᵏ.


(5) ⟦N⟧ = λxᵏ[P(xᵏ)]
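
As a rough illustration of (5), a noun denotation can be modelled as a characteristic function over a domain of kinds rather than a domain of objects. The toy domains `KINDS` and `OBJECTS` and the helper `noun` below are assumptions made for this sketch only; kinds are treated as unanalysed atoms, and the property P is simply identified with a kind label.

```python
from typing import Callable

# Toy ontology, assumed for illustration: kinds and objects are disjoint sets
# of atoms. Kinds are unanalysed atoms, not sets of objects or subkinds.
KINDS = frozenset({"PANDA", "WHALE", "BOY"})
OBJECTS = frozenset({"ling_ling", "moby", "tom"})


def noun(descriptive_content: str) -> Callable[[str], bool]:
    """Build a noun denotation as in (5): a predicate over kind entities.

    Simplification: the property P is identified with the kind's label, so
    each noun is true of exactly one atom in KINDS and of no object.
    """
    label = descriptive_content.upper()

    def denotation(entity: str) -> bool:
        return entity in KINDS and entity == label

    return denotation


panda = noun("panda")
print(panda("PANDA"))      # True: the kind entity has the panda property
print(panda("ling_ling"))  # False: an object is outside the noun's range
```

The point of the sketch is only that the predicate ranges over the kind domain; nothing in it is meant to settle what kinds ultimately are.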


Having given a formal definition of the denotation of a common noun, we will 
now briefly clarify our more general assumptions about kinds, although we do
not claim to give a fully justified answer to the question of what type of entities
kinds essentially are. Following Borik & Espinal (2015), we adopt the claim that 
kinds are not sets of subkinds, but are instead perceived as integral, undivided 
entities with no internal structure, which means that kinds do not form part of 
a standard quantificational domain for individuals represented by a lattice struc- 
ture (Link 1983). We also share the view of Mueller-Reichau (2011), according to 
whom kinds are, in essence, abstract sortal concepts. Sortal concepts are mental 
representations that are used to "categorize and individuate objects" (Mueller- 
Reichau 2011: 21). Thus, kinds are entities, but their (mental) representations are 
obtained by abstraction over a number of individual objects that share certain relevant
properties. This, however, does not mean that linguistically a kind should
necessarily be construed as a set of representative objects, although
conceptually it might be the case.


2.2 Theoretical postulate 2: The definite article corresponds to ı and 
expresses maximality 


In Partee (1986), it is proposed that definite noun phrases are generated by a
type shifting operator that maps a singleton property of type ⟨e, t⟩ onto an individual
denotation of type ⟨e⟩. This type shifting operation is called iota. In this sense,
the meaning of the definite article is to map a property onto the maximal/unique
individual having that property.¹⁰


(6) ⟦Def⟧ = P ⇒ ıx[P(x)]


When the definite article applies to a noun that denotes a property of a kind, 
the iota operator yields a maximal/unique kind entity. This is how definite kind 
expressions are derived. Crucially for our analysis, in the composition of definite 
kinds, there is no intervener between the iota operator, associated with the defi- 
nite article (in languages with articles), and the noun. We illustrate this derivation 
in example (7). 


(7) a. The panda is on the verge of extinction.
    b. [DP the [NP panda]]
    c. ⟦the panda⟧ = ıxᵏ[panda(xᵏ)]


The subject of (7a), repeated from (2a), is a definite kind expression derived by 
applying the iota operator to the noun panda. Its syntactic structure is given in 
(7b), and the semantic composition associated with this expression is provided 
in (7c).!! This is the essence of our analysis of definite kinds, which we would 
like to extend to Russian. In this section, we have presented the fundamental 
theoretical postulates on which we base our analysis of reference to kinds in 
natural languages. We now address the main issue of this paper, namely, the 
question of whether Russian has definite kinds, in spite of the fact that it has no
overt articles, and what arguments support the existence of definite
kinds in this language.
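
The composition in (6) and (7c) can be sketched computationally as a partial function that returns the single entity satisfying a property. This is a simplified sketch under stated assumptions: the function name `iota`, the toy domain `KINDS`, and the use of `None` to stand in for presupposition failure are choices made here for illustration; since kinds are taken to be atoms without internal structure, maximality reduces to uniqueness in this toy model.

```python
from typing import Callable, Iterable, Optional


def iota(prop: Callable[[str], bool], domain: Iterable[str]) -> Optional[str]:
    """Return the unique member of `domain` that satisfies `prop`.

    None stands in for presupposition failure, i.e. the property holds of
    no entity or of more than one entity in the domain.
    """
    satisfiers = [entity for entity in domain if prop(entity)]
    return satisfiers[0] if len(satisfiers) == 1 else None


# Toy kind domain (assumed for illustration).
KINDS = ["PANDA", "WHALE"]


def panda(entity: str) -> bool:
    """Noun denotation: a property true of exactly one kind entity."""
    return entity == "PANDA"


def mammal(entity: str) -> bool:
    """A property true of several kinds, so iota is undefined for it here."""
    return entity in {"PANDA", "WHALE"}


print(iota(panda, KINDS))   # PANDA
print(iota(mammal, KINDS))  # None
```

Applying `iota` directly to the noun denotation, with nothing in between, mirrors the claim that no element intervenes between the iota operator and the noun in the composition of definite kinds.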


3 Definite kinds in Russian 


As we pointed out in §1, the correspondence between the English definite kind
expression in (2a) and the Russian bare nominal in (1a) (repeated in (8)) with a
kind reference is usually assumed to hold, and a reasonable expectation is that 
the analysis adopted for definite kinds in English can also be extended to Russian 
cases. 


¹⁰The terms maximal and unique are used in this paper in the sense of Sharvy (1980) and Link
(1983), who provide a unified semantics for definiteness, independently of whether the definite
article combines with a singular or a plural expression. Thus, these terms should not be
confused or even associated with plural and singular number, respectively.

¹¹Once again, we propose this derivation for all types of nouns, i.e. count, mass and abstract
nouns. See Borik & Espinal (2015) for details.


9 Definiteness in Russian bare nominal kinds 


(8) Panda naxoditsja na grani isčeznovenija.
    panda.NOM.SG is.found on verge extinction.GEN
    ‘The panda is on the verge of extinction.’


However, any analysis of English definite kinds includes at least the iota op- 
erator in the semantic representation (cf. Chierchia 1998, Dayal 2004). The iota 
operator is standardly assumed to correspond to the definite article, a claim that 
we do not want to challenge. However, in the absence of articles in Russian, we 
should be able to find other independent evidence that the iota operator is, in- 
deed, present in the semantic representation of the subject argument in (8) and 
not merely assume that it is there due to an interpretation that corresponds to 
the English kind nominal. In §3.1 and §3.2 we provide independent empirical
semantic and syntactic arguments for the definiteness of the subject in (8) and 
argue that it is to be associated with a null D(eterminer), interpreted as the iota 
operator. 


3.1 Semantic definiteness of kind referring expressions 


The core of the argument that we employ to prove that Russian definite kinds 
are really semantically definite is based on the use and interpretation of these 
expressions in a context that requires definiteness. The following context can 
show that kind-referring expressions behave like proper definites. 


(9) Context: In a biology lesson, the teacher explains various things about 
mammals. She explains that there are many endangered species in the 
world, then says the following: 

The whale, for instance, is on the verge of extinction. 


Note first that in English, the only morphologically singular expression that 
can refer to the species itself, and not to a subkind or an individual whale, is the 
definite one, i.e. the whale (Jespersen 1927), which we claim to be unspecified for 
Number. A DP with a demonstrative or a numeral, as illustrated in (10), will not 
get the same interpretation as the definite kind expression in (9). 


(10) a. This whale, for instance, is on the verge of extinction. 


b. One whale, for instance, is on the verge of extinction. 


(10a) with the demonstrative can only be acceptable if the teacher points directly
to a picture of a representative instance of the corresponding type of whale
(say, a blue whale), and thus refers to a subkind via a representative, and (10b)
can only refer to a subkind of whale as well. 

In Russian, in the context of (9), the only expression that can be used is the bare 
noun kit, as illustrated in (11). Kit in (11) has exactly the same interpretation as 
the overt DP the whale in English, and cannot get an interpretation comparable 
to (10a) or (10b). This strongly suggests that kit in (11) corresponds to a definite 
kind referring expression. 


(11) Kit, naprimer, naxoditsja na grani isčeznovenija.
     whale.NOM for.instance is.found on verge extinction.GEN
     ‘The whale, for instance, is on the verge of extinction.’


Note, however, that theoretically, there could still be an option that while in 
English the kind referring DP has to be definite, in Russian it might be indefinite. 
Next, we will discuss why this is not the case. 

Even though it is commonly believed that with k-level predicates indefinite 
DPs can only be interpreted taxonomically, i.e. as referring to a subkind rather 
than to a kind (see Mueller-Reichau 2011 and references therein), Dayal's (2004) 
examples like to invent a pumpkin crusher challenge this standard assumption. 
In this paper, we follow Mueller-Reichau who argues that there is a fundamen- 
tal difference between k-level predicates like to be extinct and the ones like to 
invent. Only the latter allow for reference to novel (non-familiar) kinds, whereas 
the former impose a familiarity condition on the argument. This is why, by de- 
fault, A blue whale is in danger of extinction can only be interpreted as referring 
to a subkind of the blue whale, whereas Fred invented a pumpkin crusher can be 
interpreted as referring to the kind pumpkin crusher, as well as to a subkind of 
crusher.¹² This distinction between different types of k-level predicates is motivated
both empirically, by the examples just given, and by our intuition: it is difficult
for something that has not existed before to become extinct; therefore, to
be extinct requires familiar entities. By contrast, it is expected that if someone
invents something, they will invent novel entities.

We observe similar effects in Russian with the same type of predicates: in (12a)
an indefinite description can only refer to a subkind of whale, but the nominal in
object position in (12b) can refer, indeed, to a new kind of artifact, a ‘mechanical
calculator’, as well as to a subkind of ‘calculator’.¹³

¹²We thank an anonymous reviewer for the observation that Fred invented a pumpkin crusher
allows for two interpretations: the kind ‘pumpkin crusher’ and a subkind of ‘crusher’. Our
intuition is that this is due to the fact that the object NP contains a modified noun. Thus, if
we consider a non-modified NP, as in Steve Jobs invented an i-pod, only the subkind reading is
salient.


(12) a. Odin kit naxoditsja na grani isčeznovenija.
        one whale.NOM.SG is.found on verge extinction.GEN
        ‘One whale is in danger of extinction.’

     b. Fred izobrel odnu sčetnuju mašinu.
        Fred invented one.ACC.SG calculating.ACC.SG machine.ACC.SG
        ‘Fred invented a mechanical calculator.’


Thus, we have every reason to believe that the same distinction between different
types of k-level predicates that Mueller-Reichau postulates for English also holds 
in Russian. Crucially, according to this view, with predicates of the extinct-type, 
"the speaker presupposes the existence of instances of the kind X as known to 
the hearer" (Mueller-Reichau 2011: 80). This lexical specification blocks reference 
to a kind for an indefinite expression in the context of extinct-type predicates.¹⁴

Let us now go back to our example (11). As has just been demonstrated in 
(12a), should the subject of (11) be indefinite, it would necessarily yield a subkind 
reading, which it does not. This allows us to conclude that the subject argument 
in (11) is indeed a definite expression and the semantic representation for this BN 
includes the iota operator, which "supplies" its definiteness, as shown in (13). 


(13) ⟦kit⟧ = ıxᵏ[kit(xᵏ)]


The iota operator simply selects the unique entity that refers to the class itself 
(i.e. to the class described by the noun kit), but does not make the denotation 
restricted to a given world. 
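
As a hedged illustration, the representation in (13) can be spelled out type by type. The sketch below follows the chapter's description of common nouns as properties of kinds; the superscript k on e and the explicit type assigned to the iota operator are notational assumptions made for this sketch, not formulas taken from the text.

```latex
% Sketch only (requires amsmath); the type given to \iota is an assumption.
\begin{align*}
[\![\textit{kit}]\!] &= \lambda x^k.\,\mathrm{kit}(x^k)
  && \text{type } \langle e^k, t\rangle \\
\iota &= \lambda P_{\langle e^k, t\rangle}.\,\iota x^k\,[P(x^k)]
  && \text{type } \langle\langle e^k, t\rangle, e^k\rangle \\
\iota([\![\textit{kit}]\!]) &= \iota x^k\,[\mathrm{kit}(x^k)]
  && \text{type } e^k
\end{align*}
```

On this reading, the output is a kind-level entity rather than a set of world-bound individuals, which is consistent with the remark that the denotation is not restricted to a given world.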

The next issue we need to address is what kind of syntactic structure corre- 
sponds to the semantic representation in (13). 


13There are overt indefinite markers in Russian, although they are not articles. In (12) we use 
the unstressed version of odin ‘one’, which we take to be a specificity marker for indefinites 
in Russian (cf. Ionin 2013). If this marker bears stress, it is interpreted as a numeral. Note also 
that not all native speakers readily accept a subkind interpretation for examples like (12a). We 
have encountered judgments that vary from full rejection to full acceptance. 

14Similarly, Stanković (2016) postulates a complex DP structure for Serbo-Croatian, which 
includes a kind-referring DP embedded under an individual-referring DP. He argues that the 
kind-referring DP can only be definite, not indefinite, in Serbo-Croatian. 

3.2 Syntactic arguments for a DP structure 


In example (7b) of §2 we already gave a syntactic structure for the definite kind 
expression in (7a), so it should be clear by now that the general syntactic structure 
associated with definite kinds should look like (14). 


(14) [DP D [NP N]] 


Syntactically, we defend the claim that definite kinds in Russian are DPs, that is, 
the D-layer is present in the syntactic representation of definite kind arguments 
even though there is no overt realization of the D-projection. 

Before we discuss this analysis, let us point out that we assume a strict 
correspondence between syntactic and semantic representations at the 
syntax-semantics interface as a null hypothesis. This view on the syntax-semantics 
interface by default requires a consistent syntactic representation for each particular 
semantic operation. In the case of definite kinds, the operator that turns the meaning of 
a common noun (i.e. a property of kinds; see §2) into a kind expression is the iota 
operator, which needs to be represented syntactically, unless we assume that all 
nouns are structurally ambiguous and one and the same expression can be as- 
sociated with various syntactic structures. Since there is ample cross-linguistic 
evidence that the iota operator is syntactically represented by the definite arti- 
cle (consider, for example, the situation in Germanic and Romance), we should 
conclude that we need a D projection even for article-less languages where iota 
is not lexicalized. Making this proposal, we follow the insights of Longobardi 
(1994; 2001; 2005), who claims that semantic referentiality (i.e. being a referring 
expression) is associated with a particular syntactic position, namely, the head of 
the DP. This claim could be considered one of the strongest mapping principles 
between the syntax and semantics of natural languages, and it fits neatly with 
the syntax-semantics correspondence that we are assuming in this paper. 

As for Russian, proposals that provide a similar semantic motivation for the DP 
projection with a null D have been made, for instance, by Ramchand & Svenonius 
(2008), who argue that the D head in Russian is needed for reasons of semantic 
uniformity: this is the head that turns nominal expressions, which are 
originally of property type $\langle e, t\rangle$, into arguments, i.e. expressions of 
type $\langle e\rangle$. They further suggest that the D head in Russian should be 
underspecified for features like (in)definiteness, (un)specificity, etc., which are 
determined contextually. This means that DPs in Russian can represent definite or 
indefinite (specific and non-specific) arguments, a hypothesis that we adopt here as well. 

However, the strict syntax-semantic correspondence is a working hypothesis 
that, in and by itself, cannot be taken as an argument for the presence of the DP 
layer in the syntactic representation of definite kinds in Russian. A well-known 
debate in the literature on languages with and without articles is the discussion 
between the Universal-DP hypothesis (Longobardi 1994; Cinque 2005; Perelts- 
vaig 2007) and the Parametrized-DP hypothesis (Bošković 2005; 2008; Bošković 
& Gajewski 2008; Bošković 2009). According to the former, languages with or 
without articles would have all nominal arguments projected as full DPs and 
would allow null Ds. According to the second hypothesis, however, there exist 
two types of languages, those with articles (like English and Modern French), 
which project arguments as DPs, and those without articles (like Serbo-Croatian 
and Russian), which are postulated to project NPs.15 

We adopt the view advocated by Pereltsvaig (2006), according to which nomi- 
nal arguments can differ in "size", i.e. have different types of syntactic structure 
in argument position, both across languages and language internally. Thus, in 
both Russian and, for instance, English or Spanish, we can find nominal argu- 
ments that syntactically correspond to either full DPs or smaller nominals: NPs, 
NumPs or QPs.16 In Russian, nominal arguments associated with different syntactic 
structures exhibit a number of different properties and have a different semantic 
interpretation as well. In particular, DP subjects obligatorily agree with 
the verbal predicate, whereas small nominals do not. Agreeing subjects allow an 
individuated/specific interpretation and a non-isomorphic wide scope reading, and they 
may control PRO and be antecedents of anaphors, whereas non-agreeing subjects 
do not.17 To illustrate this difference between agreeing and non-agreeing nominal 
subjects, consider the minimal pair in (15) (from Pereltsvaig 2006: 438-9, ex. 3). 
Example (15a) exhibits number agreement between pjat’ izvestnyx aktërov ‘five 
famous actors’ and the verb, and this agreement is supposed to correlate with 
the distributive individuated interpretation of the subject, in the sense that each 
one of the famous actors played a role in the film. By contrast, in example (15b) 
there is no number agreement between the subject and the verb, the latter being 
in the third person singular neuter default form.18 Lack of syntactic agreement 


15The Parametrized-DP hypothesis is given extensive empirical motivation in the literature. 
However, the arguments for the DP/NP split between languages, to the best of our 
knowledge, are purely syntactic (e.g. left-branch extraction, negative raising, superiority 
effects; see e.g. Bošković 2008). The proponents of the Parametrized-DP hypothesis usually 
do not take into account the semantic functions attributed to the DP projection as we do 
in this paper. 

16For similar claims in Romance languages see Schmitt & Munn (1999; 2003), Munn & Schmitt 
(2005), Dobrovie-Sorin et al. (2006), Cyrino & Espinal (2015), among others. 

17For details, see Pereltsvaig (2006: 447). 

18Pereltsvaig (2006) does not indicate SG, but only NEUT, in the gloss for the verb in this 
example, because nouns, verbs, adjectives and various agreeing elements can express gender 
only in the singular. We modified the gloss to include the number specification on the verb 
plus the number and case on the noun for the sake of explicitness. 

correlates with a group interpretation of the nominal expression. This means that 
the subject argument pjat’ izvestnyx aktërov ‘five famous actors’ is attributed a 
full DP structure with a null D in (15a) but a QP with a numeral in (15b). 


(15) a. V ètom filme igrali [pjat’ izvestnyx aktërov]. 
        in this film played.PL five famous actors.PL.GEN 
        ‘Five famous actors played in this film.’ 

     b. V ètom filme igralo [pjat’ izvestnyx aktërov]. 
        in this film played.SG.NEUT five famous actors.PL.GEN 
        ‘Five famous actors played in this film.’ 


We find Pereltsvaig's proposal that in Russian some nominals are DPs but small 
nominals can be found in the same syntactic position as DPs very plausible, and 
thus we adopt the claim that in all languages, including Russian, there can be 
nominal arguments of different "size", that is, involving a different “amount” of 
functional structure on top of the minimal NP projection, the highest projection 
that a nominal argument can have being a DP. 

Let us now go back to definite kinds and test how arguments of k- and i-level 
predicates behave with respect to some properties listed in Pereltsvaig (2006). 
Note that only some of the properties this author lists can be tested for definite 
kinds. The reason for this is that the majority of Pereltsvaig's arguments are built 
for nominal phrases with various types of modifiers (numerals, adjectives, etc.), 
but kind expressions almost never accept regular modifiers.19 We thus focus on 
the following properties that kind arguments can be tested for: control of PRO, 
licensing of anaphors, substitution by pronominal elements and presence of non- 
restrictive relative clauses. We show that all these properties support an analysis 
of definite kinds in Russian as full DPs. 


3.2.1 Control of PRO 


Non-agreeing subjects cannot be controllers for PRO in infinitival clauses, while 
agreeing subjects, being full DPs, can. The contrast is exemplified in (16) (Perelts- 
vaig 2006: 444, ex. 10a). 


(16) [Pjat’ banditov]_i pytalis’ / *pytalos’ [PRO_i ubit’ Džemsa Bonda]. 
     five thugs.PL.GEN tried.PL / tried.SG.NEUT PRO to.kill James Bond 
     ‘Five thugs tried to kill James Bond.’ 


19See, however, §3.3 below. 

Let us now look at definite kinds. As shown in (17), definite kind subjects can 
control PRO of a purpose clause and, hence, pattern with agreeing subjects. Since 
agreeing subjects are argued to be full DPs, we can conclude that the same syn- 
tactic category should be attributed to definite kinds. 


(17) Panda_i imeet neobyčnye perednije lapy čtoby PRO_i 
     panda.SG.NOM has.SG unusual front paws in.order.to PRO 
     uderživat’ stebli bambuka. 
     hold stems bamboo 
     ‘The panda has unusual front paws to hold bamboo stems.’ 


3.2.2 Antecedents of reflexive pronouns 


Our next piece of evidence in favour of the DP status of definite kinds is that these 
expressions can be antecedents of a reflexive pronoun. We start by illustrating 
the contrast between agreeing and non-agreeing subjects with respect to their 
ability to license reflexive pronouns (Pereltsvaig 2006: 455, ex. 11a): only agreeing 
subjects can license reflexive pronouns. 


(18) [Pjat’ banditov]_i prikryvali / *prikryvalo sebja_i ot pul 
     five thugs.PL.GEN shielded.PL / shielded.SG.NEUT self from bullets 
     Džemsa Bonda. 
     James Bond 
     ‘Five thugs shielded themselves from James Bond’s bullets.’ 

As (19) illustrates, definite kinds pattern likewise. 


(19) Tigr_i znaet kak zaščitit’ sebja_i ot napadenija. 
     tiger.SG.NOM knows.SG how defend self from attacks 
     ‘The/a tiger knows how to protect itself from being attacked.’ 

This example shows that, according to the test, the antecedent of the reflexive 
must be a DP. This DP may be devoid of Number, as in the structure (14) above 
(i.e. the structure postulated for definite kinds), or may have Number. In the latter 
situation, the D can be either definite or indefinite, and either singular or plural. 


3.2.3 Pronominal substitution 


Finally, a pronominal substitution test also shows that definite kinds behave like 
DPs rather than other, “smaller” types of arguments. The test as used in Pereltsvaig 
(2006) shows that third person pronouns can be used to substitute full DPs, 
but not QPs or NPs, which can only be substituted by other (quantificational 
and/or pronominal) elements. The example below (based on Pereltsvaig 2006: 
446, ex. 15a) shows that the pronominal subject of (20b) can only substitute the 
agreeing subject of (20a). 


(20) a. Pjat’ par tancevali / tancevalo tango. 
        five couples.PL.GEN danced.PL / danced.SG.NEUT tango 
        ‘Five couples danced a tango.’ 

     b. Oni tancevali / *tancevalo tango. 
        they.PL.NOM danced.PL / danced.SG.NEUT tango 
        ‘They danced a tango.’ 


Coming back to definite kinds, it can be easily shown that the definite kind 
agreeing subject in (21a) can only be replaced by a third person pronoun ona 
‘she’, thus supporting the claim that definite kinds are DPs. 


(21) a. Panda naxoditsja na grani isčeznovenija. 
        panda.SG.NOM is.found.SG on verge extinction.GEN 

     b. Ona naxoditsja na grani isčeznovenija. 
        she.SG.NOM is.found.SG on verge extinction.GEN 
        ‘The panda/She is on the verge of extinction.’ 


The three arguments just given, which are based on the syntactic tests pro- 
posed in Pereltsvaig (2006) for differentiating between DP arguments and argu- 
ments associated with a "smaller" syntactic structure, all support the claim that 
definite kinds in Russian are syntactically DPs. 

Let us add one more observation to the arguments given above. 


3.2.4 Distribution of relative clauses 


There is a limited number of constructions in Russian where a nominal argument 
seems to have the status of a real bare NP and be associated with a minimal 
possible NP structure with no additional functional layers. A couple of relevant 
examples from Russian are given in (22) ((22b) is from Borik et al. 2012: ex. 8). 


(22) a. Petja xodit v galstuke, (*kotoryj vsegda nravitsja ego žene). 
        Petja goes in tie.SG.OBL which always likes his wife 
        ‘Petja is a tie-wearer (*which his wife always likes).’ 

     b. Katya nosit jubku, (*kotoruju ona vsegda pokupaet sama). 
        Katya wear.IMP skirt.SG.ACC which she always buys.IMP self 
        ‘Katya is a skirt-wearer (*which she always buys).’ 


The objects galstuke ‘tie’ and jubku ‘skirt’, despite being morphologically marked 
as singular, have a number neutral interpretation (i.e. one or more ties, one or 
more skirts), that is, they can denote either an atomic or a plural entity satisfying the 
description of the nominal.20 Number neutrality is a hallmark of bare nominals 
in various languages (cf. Farkas & de Swart 2003 for Hungarian; Dayal 2004 for 
Hindi; Espinal & McNally 2011 for Spanish and Catalan, etc.), so this is a good 
reason to assume that the objects in (22), despite being morphologically singular, 
are “true” bare nominals unspecified for syntactic and semantic Number. 

Note, however, that neither galstuke ‘tie’ nor jubku ‘skirt’ in this interpretation 
can be modified by a relative clause.21 We suggest that the reason why a relative 
clause is blocked in (22) is that in a real NP structure there is no room for descriptive 
modifiers, but only for classifying ones (which is in accordance with our theoretical 
postulate 1, see §2.1). A classifying modifier, but not a restrictive relative clause, 
is allowed in (23), under the intended reading that Katya is a skirt-wearer. 


(23) Katya nosit mini-jubku, (*kotoruju ona vsegda pokupaet sama). 
     Katya wear.IMP mini-skirt.SG.ACC which she always buys.IMP self 
     ‘Katya is a mini-skirt wearer (*which she always buys).’ 

Consider now an example with a definite kind expression: 


(24) a. Amurskij tigr, kotoryj očen’ opasen, obitaet na jugo-vostoke Rossii. 
        Siberian tiger which very dangerous lives on south-east Russia 
        ‘The Siberian tiger, which is extremely dangerous, lives in the 
        south-east part of Russia.’ 

     b. # Amurskij tigr, kotoryj rodilsja v našem zooparke, obitaet na 
        jugo-vostoke Rossii. 
        Siberian tiger which was.born in our zoo lives on south-east Russia 
        ‘The Siberian tiger that was born in our zoo lives in the south-east 
        part of Russia.’ 

20See Kagan & Pereltsvaig (2011) and Pereltsvaig (2013) for other types of number neutral 
arguments in Russian. In these papers, it is argued that semantically number neutral nominals are 
plural in Russian. We agree with this claim, but we think that Russian also has morphologically 
singular nominals with a number neutral interpretation. 

21This is also a property of bare nominals in the same syntactic position in Romance languages, 
such as Catalan and Spanish. See Espinal & McNally (2011). 


As can be seen in (24a), definite kinds allow subsequent modification by a non- 
restrictive relative clause. Non-restrictive (or appositive) relative clauses do not 
restrict the (set of) referents denoted by the nominal phrase, they just provide 
additional information about an already established referent. By contrast, as 
example (24b) illustrates, a relative clause that can only be interpreted restrictively 
imposes an individual (as opposed to a kind) interpretation on the subject 
of the clause, which is then difficult to combine with the verbal predicate obitaet 
‘to live’, which normally selects for kinds.22 

Let us now go back to the claim that we made at the beginning of the section, 
namely, that the incompatibility of restrictive relative clauses with definite kinds 
can be seen as an additional argument for the DP status of the kind nominal. We 
now explain why it should be so. 

Semantically, non-restrictive relative clauses are not interpreted in the scope 
of the determiner, as the following examples from English illustrate: 


(25) a. [[The public transport], [which is state-owned]], is fast, clean and 
reliable. 


b. [The [public transport which is state-owned]] is fast, clean and reliable. 


The example in (25a), which is interpreted non-restrictively, can be rephrased 
as a conjunction: 'the public transport is fast, clean and reliable and it is state- 
owned'. It does not imply (in fact, it cannot imply) that there is any other public 
transport except for the state-owned. The example in (25b), on the other hand, 
implies that not all the public transport is owned by the state and it is clear that 
the definite determiner the in (25b) has the whole nominal phrase, including the 
relative clause, in its scope. 
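
The scope contrast in (25) can be sketched with the iota notation used in (13). This is a simplified illustration under stated assumptions: the predicate labels are invented abbreviations (pt for 'public transport', so for 'state-owned', fcr for 'fast, clean and reliable'), and treating the appositive as a separate conjunct simply mirrors the paraphrase given above rather than any particular analysis from the literature.

```latex
% Sketch only (requires amsmath); pt, so and fcr are hypothetical predicate labels.
\begin{align*}
\text{(25a)}\quad & \mathrm{fcr}\bigl(\iota x\,[\mathrm{pt}(x)]\bigr)
  \wedge \mathrm{so}\bigl(\iota x\,[\mathrm{pt}(x)]\bigr) \\
\text{(25b)}\quad & \mathrm{fcr}\bigl(\iota x\,[\mathrm{pt}(x) \wedge \mathrm{so}(x)]\bigr)
\end{align*}
```

In the first line the relative clause contributes nothing to the restrictor of iota, whereas in the second it narrows the restrictor, which corresponds to the determiner taking the relative clause in its scope.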

Jackendoff (1977) suggested that the difference between restrictive and non- 
restrictive relative clauses should be reflected in their syntactic configuration, in 


22Two notes are in order here. First of all, Russian has several verbs that can be translated as ‘to 
live’, and the one used in example (24) is often used with kind nominals since its lexical meaning 
is closer to ‘to live permanently, to inhabit’. Secondly, the # sign in front of (24b) means that 
the subject can, in principle, be interpreted as referring to an individual tiger, although it takes 
a certain effort to get this interpretation, at least for one of the authors of this paper, and the 
intuition is that this interpretation is an effect of coercion. 

the sense that the latter adjoin higher in the structure than the former. Demirdache 
(1991) specifically proposed that non-restrictive relatives are adjoined to 
DP, although only at LF. De Vries (2006) postulates that appositive relative 
clauses should be represented as a coordination of DPs, with the appositive relative as 
a specifying conjunct to the visible antecedent. Arsenijević & Gračanin-Yuksek 
(2016) also argued that the configurational differences between restrictive and 
non-restrictive relative clauses should be reflected in overt syntax on the basis 
of agreement facts in Bosnian/Serbian/Croatian. Generalizing over these and 
many more works on relative clauses, we can say that the main idea is that non- 
restrictive relatives can only have a DP as an antecedent. There is no a priori 
reason to believe that Russian non-restrictive clauses would be different in their 
syntax and semantics. Therefore, we take (24a) to be another piece of evidence 
in favor of the DP status of definite kind expressions. 

The discussion of relative clauses once again supports the point made by Per- 
eltsvaig (2006): we should allow for different structures to be associated with 
nominals in argument position. (24a) above indicates that definite kinds cannot 
be NPs, as we have seen that true bare NPs do not take relative clauses, restric- 
tive or non-restrictive. If we consider the empirical contrast between (23) and 
(24a), together with Pereltsvaig's arguments discussed earlier in this section, the 
conclusion that we logically arrive at is the same: definite kinds in Russian are 
DPs. 

This conclusion allows us to preserve the correspondence between the pres- 
ence of D projection and the contribution of the iota operator, which, as we have 
seen above, is realized as a definite article in languages with articles. Our claim 
for an article-less language like Russian is, thus, that the syntactic representation 
of definite kinds involves a null D, which is translated as the iota operator, too. 


3.3 Modified definite kinds 


In §3.2 we have provided syntactic arguments for a DP structure. Still, a question 
that remains to be answered is whether definite kinds allow any sort of modifi- 
cation inside the DP. We think that the answer to this question is positive, and, 
following Borik & Espinal (2015) for Spanish, we show in this section that Russian 
has kind expressions with modifiers, which we call modified kinds. 

Modified kinds, i.e. kind-referring expressions composed of a noun and a modifier, 
normally expressed by an adjective, provide an additional semantic argument 
for the definiteness of Russian bare nominal kinds. Consider the data in 
(26). 

(26) a. Amurskij tigr zanesen v Krasnuju knigu. 
        Siberian tiger registered in Red book 
        ‘The Siberian tiger is registered in the IUCN Red List.’ 

     b. Mavrikijskij dront izvesten tol’ko po izobraženijam i 
        Mauritius dodo known only from drawings and 
        pis’mennym istočnikam XVII veka. 
        written sources XVII century 
        ‘The dodo of the Mauritius island is only known from drawings and 
        written sources of the XVII century.’ 


The modified DPs in subject position in (26), similarly to the corresponding 
non-modified versions, denote kinds. However, in comparison to the non-modified 
counterparts (e.g. tigr ‘tiger’), modified kinds (e.g. amurskij tigr ‘Siberian 
tiger’) are semantically more restricted. We suggest that modified kinds, composed 
of a noun preceded or followed by an adjective within a DP structure, are 
built by applying kind modifiers (of type $\langle\langle e^k, t\rangle, \langle e^k, t\rangle\rangle$) 
to properties of kinds (of type $\langle e^k, t\rangle$). The formal representation for the 
modified kind in (26a) is given in (27). 


(27) a. [DP D [NP (A) N (A)]] 
     b. $[\![\textit{amurskij tigr}]\!] = \iota x^k\,[(\mathrm{amurskij}(\mathrm{tigr}))(x^k)]$ 
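
The composition behind (27b) can be laid out step by step. This is a sketch that only uses the types stated in the preceding paragraph; the final line assumes, as for unmodified kinds, that the iota operator maps a property of kinds to a kind-level entity.

```latex
% Sketch only (requires amsmath); the result type of \iota is an assumption.
\begin{align*}
[\![\textit{tigr}]\!] &: \langle e^k, t\rangle \\
[\![\textit{amurskij}]\!] &: \langle\langle e^k, t\rangle, \langle e^k, t\rangle\rangle \\
[\![\textit{amurskij}]\!]\bigl([\![\textit{tigr}]\!]\bigr) &: \langle e^k, t\rangle \\
\iota x^k\,[(\mathrm{amurskij}(\mathrm{tigr}))(x^k)] &: e^k
\end{align*}
```

Because the modifier returns an expression of the same type as the noun it applies to, the iota operator combines with the modified noun exactly as it does with the bare noun.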


A question that arises at this point is what kind of adjective can appear in a 
modified kind expression. We think that potentially any adjective can modify a 
kind although the whole expression is subject to an additional pragmatic con- 
straint, known as the well-established kind restriction (cf. Krifka et al. 1995). 

The well-established kind restriction has been widely discussed in the litera- 
ture for English and other languages as applying to definite generics (cf. Vergn- 
aud & Zubizarreta 1992, Krifka et al. 1995, Dayal 2004 and many others). If the 
well-established kind restriction is pragmatic in nature, it is expected that an 
appropriate contextual modification could make a definite kind reading in (28a) 
plausible. This is, indeed, the case. If there are only two relevant classes of tigers, 
wounded tigers and hungry tigers, (28b) becomes a perfectly acceptable char- 
acterization of the first class. In this case, the interpretation that should be at- 
tributed to the subject of (28b) is the one characteristic of a definite kind. 


(28) a. Ranenyj tigr opasen. 
        wounded tiger dangerous 
        ‘A wounded tiger is dangerous.’ 

     b. Ranenyj tigr, kak vid, opasen. 
        wounded tiger as type dangerous 
        ‘The wounded tiger, as a kind, is dangerous.’ 


We propose that the well-established kind restriction can block a kind inter- 
pretation for modified nominal expressions at a pragmatic level, but this is not 
a grammatical constraint (for similar observations see Dayal 1992; Krifka et al. 
1995: 69; Dayal 2004: footnote 30). Rather, it is our world knowledge and accessi- 
ble encyclopedic information that determines which expression can correspond 
to a known or established kind in the actual world. Note, furthermore, that this 
information can change, and hence, relevant contextual or extra-linguistic fac- 
tors can have a strong influence on the interpretation of nominal expressions. 


4 Conclusions 


In this paper we have provided an analysis of definite kinds in Russian at the 
syntax-semantics interface. We have presented arguments for the semantic def- 
initeness of bare nominal kinds, and syntactic arguments for a null D. We have 
argued that definite kinds are compositionally built by applying the iota operator 
corresponding to a (covert) definite D to the property of kinds denoted by the 
N, and we have extended this analysis to modified definite kinds. The analysis 
we propose applies to one specific type of expressions which refer to kinds, the 
one that corresponds to English definite kinds. In Russian, as in many other lan- 
guages, there is a range of other expressions which plausibly encode D-genericity, 
notably, plural generics. We see it as one of the main questions for future research 
to complement our proposal by an analysis of other types of nominal generics in 
Russian and an account of similarities and differences in the meaning and use of 
various kind-referring expressions. 

Abbreviations 

GEN   genitive      MASC   masculine 
ACC   accusative    FEM    feminine 
OBL   oblique       NEUT   neuter 
SG    singular      IMP    imperfective 
PL    plural 

Chapter 10 


A morpho-semantic account of weak 
definites and bare institutional singulars 
in English 


Adina Williams 
New York University 


Weak definites in English have been widely studied as an example of when the 
definite article doesn't contribute uniqueness (Aguilar-Guevara & Zwarts 2011; 
Aguilar-Guevara et al. 2014, among others). I take uniqueness to stem from the 
interaction between definiteness and number within the noun phrase. From this 
perspective, weak definites should be seen as a data point situated in the larger 
cross-linguistic literature on number. One particular phenomenon from the literature on 
number, the understudied class of the English bare institutional singulars (BISs), 
has been discovered to share several semantic properties with weak definiteness, 
namely number neutrality, referential deficiency, and lexical idiosyncrasy. In this 
chapter, I postulate a shared account of English weak definites and BISs that uti- 
lizes semantic root ambiguity (Rappaport Hovav & Levin 1998; Levinson 2014) as 
a way to account for these facts. This account has syntactic consequences that 
resonate with recent morphosyntactic accounts of number phenomena that argue 
NumP is the host of number interpretation and marking (Ritter 1991; 1992; 1995) in 
languages like Amharic (Kramer 2009), Halkomelem Salish (Wiltschko 2008), and 
Haitian Creole (Déprez 2005). 


1 Introduction 


Noun phrase constructions called weak definites (Birner & Ward 1994; Poesio 
1994) have been heavily studied in English (Carlson & Sussman 2005; Carlson 
et al. 2006; Aguilar-Guevara & Zwarts 2011; Aguilar-Guevara 2014) and other 
languages (Schwarz 2009; 2013; 2014). They pose a problem for classical accounts 
of definite noun phrases (Frege 1892; Russell 1905; Hawkins 1978; Sharvy 1980; 
Heim 1982) which require them to be referential and denote unique individuals 
in the discourse, as is evidenced by (1) below. 


(1) Bob went to the store and Mary did too. (Carlson 2006: 19) 
(Different stores OK.) 


(2) Bob is in jail and Fred is too. (Carlson 2006: 18) 
(Different jails OK.) 


Interestingly, English has yet another noun phrase construction, the BARE 
INSTITUTIONAL SINGULAR (BIS), as in (2), that is not marked for definiteness, 
but shares many semantic properties with the weak definite, including number 
neutrality, diminished referential capacity, and lexical idiosyncrasy. Although 
it has been noted that not all lexical items can participate in weak definite and 
BIS constructions (Carlson 2006; Carlson et al. 2006; Aguilar-Guevara & Zwarts 
2011; Aguilar-Guevara et al. 2014; Aguilar-Guevara & Schulpen 2014), very few 
accounts have used this fact as fundamental in their analysis of weak definites 
(but see Baldwin et al. 2006). In this chapter, I propose a shared account for both 
weak definite and BIS constructions that accounts for both their interpretive sim- 
ilarities and their lexical idiosyncrasy. 

I propose that interpretive similarities between weak definite and BIS construc- 
tions can be derived via root semantic type ambiguity (see Rappaport Hovav & 
Levin 1998), parallel to Levinson (2014) on verbal argument structure alterna- 
tions. The lexical items that can occur in weak definite or BIS constructions have 
a many-to-one mapping between their syntactic roots and potential denotations 
of those roots, unlike most lexical items (e.g. the strong definites1) that have a 
one-to-one mapping. Interestingly, no lexical item can participate in both weak 
definite and BIS constructions, suggesting that, although roots from both classes 
are special in that they are semantically ambiguous, the two subclasses of roots 
are associated with different pairs of possible denotations. Furthermore, the root 
denotation interacts with whether a definite determiner can be merged later in 
the derivation, and determines which of two versions of the determiner can be 
merged. 


1I use the term strong to mean definites that are unique and referring, which is slightly different 
from the use of the term in Schwarz (2009; 2013). 

I restrict my focus to weak nominal constructions2 utilizing directional predicates 
with location/institution nouns, because they provide a unique testing 
ground for investigating the relationship between number and definiteness. 
Representative sentences of the three types are given below in (3-5): 


(3) Ron went to the store. WEAK DEFINITE SINGULAR 
(4) Ron went to school. BARE INSTITUTIONAL SINGULAR 
(5) Ron went to the castle. STRONG DEFINITE SINGULAR 


In my examples, I hold the main verb and preposition constant, because alter- 
ing either has been shown to affect the availability of the number neutral inter- 
pretation (Aguilar-Guevara 2014: 18-19). Although other verbal predicates can 
be used in sentences that get weak readings, I use the light verb to go because 
it is compatible with all three sentence types (3-5). Because of their restricted 
syntactic distribution, weak definites are often cited as having an “idiomatic” flavor 
(Nunberg et al. 1994), a property they share with BISs. I chose to use lexical 
items from the location/institution class of weak definites (Stvan 1998) and BISs, 
because they are the most freely combining (Baldwin et al. 2006), making them 
a good class to work with. 

This chapter is organized as follows. §2 argues in favor of interpretive similarities 
between weak definites and BISs. §3 discusses the lexical idiosyncrasy of 
roots that participate in weak definite and BIS constructions. §4 discusses syntactic 
consequences of adopting a root semantic type ambiguity account of weakness 
in English nominals. §5 provides a morpho-syntactic analysis that builds on 
work on cross-linguistic number that suggests number neutrality has a syntactic 
reflex, i.e. a lack of a Num projection (as in languages with general number). I 
also show that the denotation of roots affects which interpretations and syntactic 
structures are possible. Finally, §6 concludes. 


2The term weak definite does not necessarily correspond to a single, uniform class in either the 
syntactic or semantic sense, and thus, different subtypes of weak definites have been given a 
wide range of theoretical and experimental treatments (see, for example, Barker 2005; Klein 
et al. 2009; Aguilar-Guevara & Zwarts 2011; Klein 2011; Aguilar-Guevara & Schulpen 2014; 
Schwarz 2014), and extending this account to other subtypes (e.g. those given in Stvan 1998) is 
left for future work. 

2 Weak definite singulars and bare institutional singulars share semantic properties


Weak definite singulars and BISs share interpretive similarities with each other, 
to the exclusion of strong, referring definite singulars. There are multiple diag- 
nostics for weakness (see Carlson & Sussman 2005), all of which indicate that 
BISs and weak definites do not have to refer to a singular entity: they can be 
used in contexts where multiple entities can satisfy the descriptive content of 
the definite, they can receive sloppy identity under VP ellipsis, their behavior 
differs from that of referring definites under a type of sluice (under a novel di- 
agnostic test), and they have an impaired ability to antecede pronouns in the 
following discourse. 

Before I present the diagnostic tests, it is important to caution the reader that 
some weak definite Det-N strings are ambiguous between weak and strong in- 
terpretations. Therefore, I use a subset of lexical items for each class of nominals 
to help readers access the appropriate readings throughout this section (these 
lexical items are provided in the footnotes to Table 1 for reference). 


Table 1: Classes of lexical items


                         +Definite marked     -Definite marked
Weak interpretation      WEAK DEFINITEᵃ       BARE INSTITUTIONAL SINGULARᵇ
Strong interpretation    STRONG DEFINITEᶜ     (empty)ᵈ


ᵃ Relevant lexical items: e.g. the store, the bank, the hospital (potentially ambiguous between
weak and strong definite interpretations).

ᵇ Relevant lexical items: e.g. school, church, prison, jail (unambiguously weak).

ᶜ Relevant lexical items: e.g. the castle, the stadium, the restaurant (unambiguously strong).

ᵈ I assume this cell is empty due to the Blocking Principle discussed in Chierchia (1998: 360), and
Deal & Nee (2016). The Blocking Principle states that bare nominals cannot be interpreted as
definite, because there is a lexically specified type shifter present in the language that performs
this function.


2.1 Multiple entities satisfying descriptive content 


Weak definites and BISs can be used in contexts where multiple entities satisfy 
the descriptive content of the noun phrase, suggesting that they don’t uniquely 
refer (Carlson & Sussman 2005). In (6-8) below, each of the bolded noun phrases 
fails to require a single unique referent:

(6) Don went to the zoo.


(7) Sue took her nephew to the hospital/the store/the beach. (Carlson et al. 
2006: 2) 


(8) Please take the elevator to the second floor. (Aguilar-Guevara 2014: 14) 


Although the examples above can be used to refer to identifiable, unique ref- 
erents in the discourse, one can also utter (6) in cities where there are multiple 
zoos, (7) in towns where there are multiple hospitals, stores or beaches, and (8) 
when standing before a bay of elevators. Furthermore, weak definites can also 
be used in situations with multiple potential referents in the discourse, allowing 
the weak definite noun phrase to stand for a plurality of entities: 


(9) Context: Ron has been looking for Don, who was supposed to help him set 
up a party, but then went missing for a while. 
Ron: Hey Don! Where have you been? The party starts in an hour! 
Don: I went to the store to buy balloons. I had to go to four of them because 
the first three were all sold out! 


In the mini-discourse in (9), the bolded definite marked noun phrase the store
does not impose a restriction that there only be a single, unique store in the
context, because immediately following the definite, Don mentions that he went to
four of them. If the definite noun phrase in (9) did impose this restriction, we
would predict the mini-discourse to be infelicitous. Similarly, the bare singular,
as in (10), can also be used felicitously in situations where multiple entities
satisfy the BIS's descriptive content.


(10) Context: Ron just met up with Don at their ten-year high school reunion. 
Ron: Hey Don! Wow, you look great! What have you been up to for the last 
ten years? 

Don: Funny you should ask... Actually I went to prison for five years after 
high school. I spent the first three years on Riker's Island, and the last two, in 
Alcatraz. 


Since BISs and singular weak definite noun phrases both lack the uniqueness 
required for strong definite descriptions under this diagnostic, one would hope 
that the two types of weak nominal should have some grammatical similarities. 
Compare the two discourses above with the one below: 


³ The interpretation of the following examples is not exhaustive; they are infelicitous in
situations where there are only e.g. four stores, as in (9).

(11) Context: Ron and Don are on a vacation in Britain. They split up for a few
days and are just meeting up again to continue on their adventure. The 
two had discussed their travel plans before splitting up. 

Ron: Hey Don! How did your weekend go? See anything interesting? 

Don: Yeah, I had a really great weekend. I went to the castle and got some 
great pictures. ??On Saturday, I went to Windsor Castle, then took a train 
over to Dover Castle on Sunday. 


In this case, because Don's response is unnatural, I conclude that the defi- 
nite noun phrase the castle requires a single, unique referent in the discourse. 
The incompatibility of (11) suggests that the lexical item conditions whether the 
uniqueness presupposition is present, since it is unacceptable to use the singular 
definite noun phrase the castle in a context where there are multiple castles. 


2.2 Sloppy readings under VP ellipsis 


Singular weak definites and BISs differ from strong definites in that they do not 
require that the elided noun and the overt one refer to the same exact individual; 
they merely require that the individual(s) they refer to satisfy the descriptive 
content of their shared noun phrase. This loose identity requirement on noun 
phrases under VP ellipsis is called SLOPPY IDENTITY. 


(12) Bob went to the store and Mary did too. (Carlson 2006: 19) 
(Different stores OK.) 


(13) Bob is in jail and Fred is too. (Carlson 2006: 18) 
(Different jails OK.) 


If the noun phrases in the antecedent VP in (12) and (13) are still faithfully 
duplicated in the ellipsis site, then presumably they cannot be strong definite 
noun phrases. Under VP ellipsis, they only need to match in the syntactic material 
that is present. Since the syntactic material present does not introduce a unique 
noun phrase, strict coreference is not required. In other cases, the noun in the 
elided phrase is required to be coreferential with the unique singular individual 
in the antecedent VP, as in (14): 


(14) Ron went to the castle and Don did too. (strong reading only) 
(Must be the same castle.) 


In (14), there is a full strong noun phrase present in the ellipsis site. We only
get a felicitous interpretation if the overt noun phrase and elided one refer to
the same individual. In (15) below, we can see that the store is interpreted as a
weak definite based on this diagnostic from above:


(15) Ron went to the store and Don did too. Ron went to Krogers, and Don went 
to Meijers. 


We can see that the store in (15) can be used felicitously in VP ellipsis contexts,
where multiple locations satisfy the descriptive content of the noun phrase. 


2.3 Sluicing 


One final diagnostic, which is novel, comes from another ellipsis phenomenon, 
sluicing (Ross 1967; 1969). Sluicing separates strong definites from weak definites 
and BISs, as the latter two are acceptable under a sluice, and the former is not: 


(16) I know Ron went to church as a kid, but I don't know which one/church.
(17) I know Don went to the store after work, but I don't know which one/store.


(18) ?? I know Don went to the castle after work, but I don't know which one/castle.


In (18), one must have a referent in mind to felicitously use the definite marked
noun phrase, which explains the unnaturalness of the sluice. Since (16) and (17)
are acceptable under the sluice, one particular referent is not required. Thus, like
the ellipsis diagnostic above in §2.2, sluicing allows us to argue for the lack of
referentiality in weak nominals.


2.4 Limited capacity to establish discourse referents 


Following Aguilar-Guevara & Zwarts (2011: 182), I note that weak definites and 
BISs have a limited ability to establish discourse referents, which results in them 
being worse than strong definites at anteceding pronominal it. I assume that 
anaphorically linked noun phrases, like it, must match their antecedent in as 
many features (such as number specification and referentiality) as possible. If it 
is taken to be (generally) referring, and specified for singular, then it will have 
trouble matching its features with weak nominals that are neither referring nor 
specified as being singular (see §2.1). If there is only one nominal in the context,
and it is referential and singular, it can be anaphorically linked to it, as in (19) 
and (20): 


(19) Ron went to the store and Don went to it too. They both went to Krogers.

(20) Ron went to the castle and Don went to it too. They both went to
Neuschwanstein Castle.


However, if we have pronominal it - which is referring (in this case), and
wants to match its number features with its antecedent - in a context with
multiple potential referents (as in 21), the sentence becomes less felicitous.


(21) Ron went to the store and Don went to it too. ? Ron went to Krogers, and Don 
went to Meijers. 


Despite the fact that lexical items like store can participate in weak definite 
constructions, by establishing coreference with it in (21), the noun phrase the 
store can only receive a strong, referring interpretation. One way to encode this 
difference would be to say that some singular definite noun phrases (like the store) 
are actually ambiguous between noun phrases that are un-marked for number, 
and those that are marked for singular. In English, these two options will be 
string identical. When a pronoun tries to establish coreference with a definite 
noun phrase that is un-marked for number, the result is degraded, as in (21). 

If pronouns must match features with their antecedents, non-referring noun 
phrases like BISs should not have enough features to match with the pronoun, 
and thus should be even more degraded. This prediction is borne out: 


(22) Don went to church_i and Ron went to it_{*i, j} too.


Establishing an anaphoric link with a referring pronoun is less acceptable for 
weak definites, but the BISs are unable to establish coreference with the pronoun 
at all. Therefore, one could assume that there are two missing features that make 
BISs unable to set up coreference, while for weak definites, there is only one (i.e. 
the number feature is missing). I claim that NumP is the crucial projection that 
is missing in both types of weak nominals; see §4 for further discussion.


2.5 Summary 


In this section, I described the interpretive similarities that weak definites and
BISs share to the exclusion of strong definites; weak nominals can be used in
situations where multiple entities satisfy the descriptive content (§2.1), can receive
sloppy readings under VP ellipsis (§2.2), are compatible with sluicing (§2.3), and
have limited capacity to establish discourse referents (§2.4).



3 Lexical idiosyncrasy 


As discussed in the introduction, not all lexical items are equally able to partic- 
ipate in weak constructions (see Table 1). Weak definite and BIS interpretations 
are particularly sensitive to the identity of the lexical item: 


(23) Don went to the zoo/#the conservatory. 
(24) Please take the elevator/#the forklift to the second floor. 
(25) Sue took her nephew to the hospital/#the hospice. 


Even roots with comparable meanings (e.g. hospital and hospice) are unable to 
receive weak interpretations. It has been widely noted that weak interpretations 
for nominals are only available for certain lexical items, but few works other 
than Baldwin et al. (2006) discuss this explicitly. Certain lexical items, e.g. store, 
from the WEAK-STRONG AMBIGUOUS class can be interpreted as weak or as strong, 
while others, e.g. castle, from the STRONG-ONLY class can never be interpreted 
weakly (repeated from above, 12 and 14). 


(26) Ron went to the store and Don did too. 
(Can be the same store.) 


(27) Ron went to the castle and Don did too. 
(Must be the same castle.) 


Because root identity seems to condition whether the weak reading is avail- 
able, perhaps a lexical ambiguity is present. This could mean that there are two 
denotations paired with the root, store, but only one denotation for the root, cas- 
tle. I argue that this lexical ambiguity manifests itself in the semantic type of 
the root (à la Levinson 2014), as opposed to being a restriction on the type of
elements that are present in the extension of the noun phrase. 

The choice of root has consequences for the syntax. One piece of evidence 
in favor of a root-level semantic ambiguity that affects syntax is that the weak 
interpretation disappears when the root appears outside of constrained syntactic 
frames compatible with the weak interpretation. For example, store cannot be 
interpreted weakly in subject position:⁴


⁴ If the noun is present in the subject position of a “characterizing sentence” in the sense of
Carlson (1977) and subsequent work, the definite noun phrase can receive a kind interpretation: 


(i) The store is a miraculous and entertaining place to visit. 


I take kind-referring noun phrases to be constructed differently than the definites I account 
for here, and leave an account comparing the two for future work. 

(28) The store is closed today ( but I don't know which).
(Must be a strong reading.) 


Similarly, lexical items from the BIS class cannot receive a weak interpretation 
in subject position, see (29). However, when they occur with a definite article, 
they must receive a strong, referring interpretation; the weak interpretation is 
not allowed, see (30): 


(29) School is closed today. 
("School" here is a proper name referring to the speaker's school, or to the
maximal set of all relevant schools.) 


(30) Ron went to the school and Don did too. 
(Must be the same school.) 


Thus, lexical items from each of the three classes can receive a referring
interpretation when they are in definite marked noun phrases, but only a subset
can receive a weak interpretation when definite marked or bare. Some roots can
only receive strong interpretations (STRONG ONLY). Others (roots from the
WEAK-STRONG AMBIGUOUS class) can receive either. A third class of lexical items (BIS)
can appear unmarked for plurality and definiteness; when they do bear definite
marking, they can only receive a strong interpretation. The behavior of
these classes of roots is summarized in Table 2.⁵


Table 2: Three lexical classes of roots


                            STRONG ONLY    STRONG-WEAK AMBIG.    BIS
the+NP can be strong        Y              Y                     Y
the+NP can be weak          N              Y                     N
can be bare/incorporated    N              N                     Y
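
As an informal illustration, the classification in Table 2 can be encoded as a small lookup structure. The sketch below is not part of the analysis itself: the names RootClass, LEXICON and readings are hypothetical, and the mini-lexicon simply reuses a few of the lexical items listed in the footnotes to Table 1.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class RootClass:
    """One column of Table 2 (hypothetical encoding)."""
    name: str
    the_np_strong: bool  # row "the+NP can be strong"
    the_np_weak: bool    # row "the+NP can be weak"
    can_be_bare: bool    # row "can be bare/incorporated"


STRONG_ONLY = RootClass("STRONG ONLY", True, False, False)
STRONG_WEAK_AMBIG = RootClass("STRONG-WEAK AMBIG.", True, True, False)
BIS = RootClass("BIS", True, False, True)

# Hypothetical mini-lexicon; class membership follows the footnotes to Table 1.
LEXICON: Dict[str, RootClass] = {
    "castle": STRONG_ONLY,
    "stadium": STRONG_ONLY,
    "store": STRONG_WEAK_AMBIG,
    "bank": STRONG_WEAK_AMBIG,
    "school": BIS,
    "church": BIS,
}


def readings(root: str, definite: bool) -> Set[str]:
    """Return the interpretations Table 2 allows for 'the ROOT' or bare 'ROOT'."""
    cls = LEXICON[root]
    if not definite:
        # Bare singulars are only licensed for BIS roots, and only as weak.
        return {"weak"} if cls.can_be_bare else set()
    available = set()
    if cls.the_np_strong:
        available.add("strong")
    if cls.the_np_weak:
        available.add("weak")
    return available
```

For instance, readings("store", definite=True) yields both interpretations, whereas readings("school", definite=True) yields only the strong one, matching the contrast between (26) and (30).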


3.1 Root semantic type ambiguity is not homophony 


I've argued that weakness starts at the root as a type difference, which then per- 
colates up to affect higher syntactic projections. However, what sort of semantic 


⁵ A lexical item that cannot get a strong or a weak interpretation, and cannot be bare, is unlikely
to exist. What would be its distribution? Would it only be present in indefinite noun phrases
with a? This doesn't seem very plausible. I leave the task of extending my lexical account to
indefinites to future work. 

ambiguity do we have in this case? I argue that this is a case of true ambiguity,
and not simple homophony. Under a homophony account, the roots have no in- 
herent connection to each other. This would mean that we would have two lexical 
items that are both pronounced, e.g. store, and that their interpretive similarity 
is accidental. 

One way to test for homophony was put forth in the general number literature 
(Rullmann & You 2006; Wilhelm 2008). In this diagnostic, homophonous lexical 
items receive parallel interpretations under VP ellipsis. I assume the following 
denotations⁶ for the two homophonous lexical items:


(31) ⟦pen_enclosure⟧ := λx.pen_enclosure(x)
(32) ⟦pen_implement⟧ := λx.pen_implement(x)


(33) Lee saw a pen, and Sam did, too.
a. Lee saw an animal enclosure and Sam saw an animal enclosure too.
b. Lee saw a writing implement and Sam saw a writing implement too.
c. * Lee saw a writing implement and Sam saw an animal enclosure.
d. * Lee saw an animal enclosure and Sam saw a writing implement.


In the example above, the word pen must receive the same lexical interpretation
across the two seeing events; either it always has to be interpreted as an
animal enclosure (as in 33a, with denotation as in 31), or always interpreted as
a writing implement (as in 33b, with denotation as in 32). Thus, if singular weak
definites and BISs involved homophonous lexical items of this kind, we should not
expect them to have
readings where the number interpretation of the noun phrase differed between
the main clause and the elided one. However, the two phrases are allowed to
differ in number interpretation:


(34) Lee went to the store/school in Boston and Sam did too.
a. Lee went to only one school/store in Boston and Sam went to only one too. 


b. Lee went to multiple schools/stores in Boston and Sam went to multiple 
too. 


c. Lee went to only one school/store in Boston and Sam went to multiple. 


d. Lee went to multiple schools/stores in Boston and Sam went to only one. 


⁶ Type conventions are as follows: x, y, z are from the domain of individuals and are type e; e,
e′, e″ are from the domain of events and are type v; m, n are from the domain of numbers and
are type n; j, k are from the domain of kinds and are type k; type t is for truth values; types
can be combinatory; P, Q are used for higher types, and their types are specified via subscript.

Thus, we can conclude that the ambiguity associated with certain lexical items
is not an ambiguity in the interpretation of the lexical item that merely prunes the 
elements in the extension. Instead, I argue for a semantic lexical ambiguity that 
affects higher structure (i.e. a type ambiguity), paired with a structural ambiguity 
that is higher. 


3.2 Root denotations for weak definites, BISs, and strong definites 


Now that we know no single lexical root can participate in both weak definite 
and BIS constructions, I postulate semantic types for the three classes of roots. 
Across all classes, roots with type ⟨n, ⟨e, t⟩⟩ are “countable”; and for the strong
determiner to be present, there must be a countable root present in the tree. This 
accords with the intuition that if one knows the referent of a noun phrase, one 
also knows the number specification of that referent. Otherwise, the weak ver- 
sion of the determiner is inserted, resulting in a weak, non-uniquely referring 
interpretation for the noun phrase. 

Each of the three classes of lexical item has a different set of potential
denotations for its roots; STRONG-ONLY lexical items have only one potential meaning,
and can only be of type ⟨n, ⟨e, t⟩⟩, STRONG-WEAK AMBIGUOUS lexical items are
semantically ambiguous and can be of type ⟨n, ⟨e, t⟩⟩ or type ⟨e, t⟩, and BIS lexical
items can have roots of type ⟨n, ⟨e, t⟩⟩ or type ⟨k, t⟩. Furthermore, I postulate
two versions of the definite determiner, one that encodes the "strong", uniquely
referring interpretation of the definite, and another that does not.
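
The type inventory just described can be restated as a minimal sketch, assuming that types are written as plain strings and that a root's type alone decides which determiner (if any) appears. The names ROOT_TYPES, determiner_for and surface_options are hypothetical, and the pairing of the kind-level type with the bare form is an assumption drawn from the claim that only BIS roots can be bare.

```python
from typing import Dict, FrozenSet, Optional

COUNTABLE = "<n,<e,t>>"  # countable roots
PROPERTY = "<e,t>"       # second option for STRONG-WEAK AMBIGUOUS roots
KIND = "<k,t>"           # second option for BIS roots

# Potential root types for each lexical class, as listed in the text.
ROOT_TYPES: Dict[str, FrozenSet[str]] = {
    "STRONG-ONLY": frozenset({COUNTABLE}),
    "STRONG-WEAK AMBIGUOUS": frozenset({COUNTABLE, PROPERTY}),
    "BIS": frozenset({COUNTABLE, KIND}),
}


def determiner_for(root_type: str) -> Optional[str]:
    """Pick a determiner for a root type (illustrative labels only)."""
    if root_type == COUNTABLE:
        return "the_strong"  # strong determiner needs a countable root
    if root_type == PROPERTY:
        return "the_weak"    # otherwise the weak determiner is inserted
    if root_type == KIND:
        return None          # assumption: kind-level roots surface bare
    raise ValueError(f"unknown root type: {root_type}")


def surface_options(root_class: str) -> Dict[str, Optional[str]]:
    """Map each root type available to a class onto its determiner."""
    return {t: determiner_for(t) for t in ROOT_TYPES[root_class]}
```

On this encoding, surface_options("BIS") pairs the countable type with the strong determiner and the kind-level type with no determiner, which is one way of stating why definite-marked BIS roots are only strong.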


4 Syntactic consequences of root semantic ambiguity 


The interpretive similarities discussed in §2 align with cross-linguistic
analyses of non-inflectional number phenomena in Haitian Creole (Déprez 2005) and
Halkomelem Salish (Wiltschko 2008); these accounts argue that these proper- 
ties correspond to number neutrality which is syntactically cashed out as the 
absence of NumP. Additionally, recent work on Russian nominal agreement (Lan- 
dau 2016) also points to NumP as necessary for both cardinality and anaphoricity. 
Bringing together semantic work on definiteness and cross-linguistic work on 
number neutrality, this analysis splits the semantic contribution to definiteness 
across two heads, D and Num, with Num contributing to number interpretation, 
and D contributing referentiality. 

Following this cross-linguistic literature on number, I assume
that both weak definites and BISs lack a NumP, which is the projection that
contributes singular or plural interpretation (Ritter 1991; 1992; 1995). I build towards
the structures in (35-37), which correspond to (3-5).


(35) Strong definite: [DP D[+SPEC] the [NumP Num[-PL] [nP n [root castle/store/school]]]]
(36) Weak definite:   [DP D[-SPEC] the [nP n [root store]]]
(37) BIS:             [nP n [root school]]


In (35-37), we see that all three classes of roots can appear in the strong con- 
struction (35), but only certain roots can appear in the weak construction (36) and 
the BIS construction (37). This accords with the data provided in §3. Moreover,
(36) and (37) differ from (35) in that they lack a Num projection. I argue that this 
syntactic difference results from the semantic type of the root. While BIS and the 
weak definite are syntactically similar in lacking a NumP, they differ in whether 
they have a DP layer. This analysis takes BISs to be pseudo-incorporated noun 
phrases, following Carlson (2006: 9-10), who has argued for such an account in 
English and for languages like Greek (Gehrke & Lekakou 2013), as well as Niuean 
and Turkish (Massam 2001; 2009). Thus, weak definites and BISs are both smaller 
than strong, uniquely referring definites; weak definites are missing one projec- 
tion, NumP, while BISs are missing two, NumP and DP. This "small" size interacts 
with an aspect of the interpretation of weak definites and BISs: the so-called se- 
mantic enrichment of weak definites and BISs follows from their super-local rela- 
tionships in a manner that is reminiscent of many idiomatic constructions across 
languages (Marantz 1995). This is discussed in more detail in the next section. 

If the account is correct in correlating root ambiguity with syntactic conse- 
quences, we might expect syntactic structure to affect the weak, number neutral 
interpretation. This prediction is borne out in two ways: changing the morpho- 
logical number marking on these nominals or modifying them with structurally 
high adjectives bleeds the weak number-neutral interpretation. If we assume 
that the locus of number marking and interpretation is NumP (Ritter 1991; 1992; 
1995), then these syntactic effects suggest that this projection cannot be present 
in noun phrases that receive the weak interpretation. Other preliminary evidence 
of the importance of NumP for interpretation comes from the domain of semantic
agreement; Landau (2016) adduces additional evidence that NumP may be an
important boundary for referential interpretation within the nominal domain from
Hebrew attributive adjectival agreement.


4.1 Enrichment of weak nominals


Another often discussed fact about weak definites is that they receive
semantically enriched interpretations. Following Aguilar-Guevara & Zwarts (2011: 182),
weak definites display "enrichment [that] is stereotypical in the sense that it
invokes the most common circumstances under which the event referred to by the
sentence could happen". Furthermore, Aguilar-Guevara & Zwarts note that the
presence of the weak reading tends to co-occur with the presence of the semantic
enrichment (examples below copied from Aguilar-Guevara & Zwarts 2011: 182,
ex. 10b, 11b):


(38) Lola went to the store. = Lola went to the store. + Lola did shopping. 
(39) ?? Lola went to the store to pick up a friend. 


Under the weak reading, (39) is anomalous, because the stereotypical enrich- 
ment is not present. Like weak definites, BISs require enrichment: 


(40) The janitor went to school. = The janitor went to a school + attended it.


(41) ?? The janitor went to school to clean.⁷


Parallel to (38-39), (40-41) show that the weak reading generally disappears 
when the extra enrichment is blocked. Extra enrichment is reminiscent of id- 
iomatic expressions, where lexical items can get special meanings based on the 
contexts they are found in. Following Marantz (1997: 208), I take idiomatic
interpretations of lexical items to crucially depend on their local syntactic context.
Given my claim in earlier sections that weak definites and BISs are syntactically 
smaller than strong definites (see 35-37), the root is closer to the definite or the 
preposition in weak definite and BIS constructions, creating the perfect local en- 
vironment for idiom-like enrichment of meaning. 


4.2 Bleeding weakness 


Now that we have seen some preliminary data compatible with the idea that 
weak nominals (i.e. singular weak definites and BISs) could be analyzed differ- 
ently from their strong counterparts, I motivate my claim that this correlates with 
a syntactic difference at NumP. What evidence can we adduce that strings like 
the store can have weak or strong interpretations depending on whether NumP is 


"This sentence can receive an interpretation that is full referential. Under this interpretation, the 
speaker claims that the janitor is going to the speaker's school to clean. For a similar example 
and more discussion, please see (28). 
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syntactically present? There are a few syntactic tests that suggest the difference 
between weak and strong nominals is below the level of DP. In the rest of this 
section, I discuss two syntactic modifications that block weak interpretations: 
plural marking and modification by high adjectives. 

Following Carlson & Sussman (2005) and Aguilar-Guevara (2014), I use sloppy 
identity under VP ellipsis as the standard accepted diagnostic for weak interpre- 
tations of definites for the remainder of this section. Thus, when I use # for a 
sentence under the weak interpretation, I mean that it cannot be read as sloppy 
under VP ellipsis. 


4.2.1 Plural marking bleeds weak interpretations 


One piece of evidence is that changing the apparent number marking on the
definite description bleeds the weak reading (Aguilar-Guevara 2014: 19):⁸


(42) Don went to the bank and Ron did too. Don went to First National Bank, and 
Ron went to CitiBank. (adapted from (15) above)


(43) Don went to the banks and Ron did too. Don went to First National Bank 
and CitiBank, while Ron went to Chase and Bank of America.


(44) Don went to the banks and Ron did too. They both visited First National 
Bank and CitiBank. 


If we compare (42) and (43), the only difference is the plural marking. While 
(42) can receive sloppy readings under ellipsis and patterns as weak nominals do 
with respect to the diagnostics in §2, (44) cannot, because the noun phrase the
banks must be interpreted as uniquely referring to a salient plurality of banks. 


⁸ Examples of plural-marked weak definites do exist:


(i) Lola went to the mountains and Alice did too. Lola went to the Alps and Alice visited the 
Appalachians. (Based on Aguilar-Guevara 2014: 20, ex. 42) 


(ii) Ron washed the dishes and Don did too. Ron washed 20 dishes, but Don only washed one. 


Crucially, these readings are also only allowed for certain lexical items. For examples like these, 
I would assume that the plural marker has a different meaning, and perhaps, a different syn- 
tactic height. This is not entirely implausible in light of (i), because one has the intuition that 
the plural marker is talking about a number of mountain peaks which all contribute to a single 
mountain range. One potential way to go would be to follow Kramer (2015) in taking some plu- 
ral markers to be merged low on the little n head, following the intuition that lower projections 
are more likely to get idiosyncratic meaning and condition contextual allosemy (Romanova 
2004; Svenonius 2005; Marantz 2013). 

In (44), adding plural marking causes the definite-marked noun phrase to lose
its weak interpretation, so that it can only be taken to refer to a unique and salient
plural set of bank locations. If weak readings are derived from kind- or
property-denoting roots (i.e. they are not countable), and the addition of NumP requires a
countable root, then plural marking hosted on NumP will be incompatible with
weak readings. In (46) below, we have further evidence that adding plural
marking bleeds the weak interpretation because the enrichment we see with the weak
interpretation is suddenly no longer available.


(45) Don went to school and Ron did too. Don went to Pioneer and Ron went to 
Huron. 


(46) Don went to schools and Ron did too. 


For (46) the two boys both physically went to multiple institutions for what- 
ever purpose (i.e. it doesn't have to be to attend school); this is in contrast to 
(45), where the enrichment is present, and each boy had to attend his respective 
school. Thus, if one varies the number specification on the noun phrase in a weak 
definite or BIS construction, the weak reading disappears, as is evidenced by the 
loss of the semantic enrichment. If it is true that number specification falls on 
NumP then adding a NumP bleeds the weak reading. 


4.2.2 High adjectival modification bleeds weak interpretations 


Another source of evidence comes from the fact that certain modifiers can bleed 
the weak interpretations of definite noun phrases (Aguilar-Guevara 2014). Cer- 
tain modifiers (e.g. canonical property adjectives) are base-generated higher (Cin- 
que 2010) than NumP, while others, classificatory or kind-referring ones (e.g. 
noun-noun compounds) are lower (see e.g. Laenzlinger 2005). The height
differences between these subtypes of modifiers are straightforwardly visible from
ordering facts:


(47) the expensive grocery store


(48) * the grocery expensive store 


High modifiers force strong interpretations of definite marked noun phrases, 
suggesting that certain modifiers require countable nominals, while others don't. 


(49) Don went to the [grocery, pet, drug, #good, #red, #expensive] store. 
(Weak reading)

(50) Don went to [boarding, nursing, catholic, #good, #red, #expensive] school.
(Bare institutional singular) 


(51) Don went to the [good, red, expensive] school. 


In (49), the definite noun phrase is unable to receive a weak reading if there is 
a high adjective merged in the DP. Similarly, because BISs are structurally small, 
they also cannot host high modifiers, as in (50). These differences could be cashed 
out as the trees below in (52-54), which build upon (35-37). 


(52) Strong definite
     [DP D[+REF] the [ A_high expensive [NumP Num[-pl] [nP A_low grocery [nP n [root store]]]]]]

(53) Weak definite
     [DP D[-REF] the [nP A_low grocery [nP n [root store]]]]

(54) BIS
     [nP A_low catholic [nP n [root school]]]


Thus, high adjectives⁹ select for a NumP. The presence of a NumP requires
that the root be countable (i.e. type $\langle n, \langle e, t\rangle\rangle$), and countable roots require that
the strong D be merged above it (or else there is a type clash). If there is no NumP 
present, a D could be merged or it could not be, depending on the identity of the 
root; this is the distinction between weak interpretations of definites and BISs. 


4.3 Summary 


In sum, this section has argued that a root semantic type ambiguity account has 
several syntactic consequences. Such an account predicts that semantic enrich- 
ment and idioms are similar, based on locality, and that the weak readings can 
be bled by several syntactic alterations within the DP, including plural marking 
and modification by high adjectives. 


⁹The modifiers that preserve the weak readings, i.e. grocery, pet and drug, do not seem to be
run-of-the-mill modifiers (e.g. it appears that they're nominal and not adjectival). Thus, you
could say that a syntactically low derivation process like noun-noun compounding could be
happening here, perhaps at the little n level. I leave the question of how the syntax of
compounding interacts with weakness to future work.
5 Analysis


Now that we have determined what sorts of semantic interpretation are required 
for weak readings of noun phrases, and that there are syntactic consequences, 
this section presents a compositional semantic fragment for strong definites, 
weak definites, and BISs, showing how root semantic type interacts with the 
interpretation of the definite article. I lay out my assumptions, then list lexical 
items, and finally provide a working fragment that derives the three separate 
interpretations, based on the syntactic structures I've advocated in §4.

First, I assume that countable nouns have atoms in their extensions; thus, I
need an atomizer function. I adopt the following one:


(55) $\textsc{Atoms}(x) = \{y \mid y \leq x \mathbin{\&} \forall z < x\,[z \not< y]\}$ (Ouwayda 2014)
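As a rough illustration of how the atomizer in (55) behaves, the following sketch models pluralities as finite sets of atom labels, so that the part-of relation is set inclusion. This modelling choice, and the names `parts`, `atoms` and the labels `c1`, `c2`, `c3`, are assumptions of the sketch and are not taken from Ouwayda (2014).

```python
from itertools import combinations

def parts(x):
    """All non-empty parts of x, where x is a frozenset of atom labels."""
    labels = sorted(x)
    return {frozenset(c)
            for r in range(1, len(labels) + 1)
            for c in combinations(labels, r)}

def atoms(x):
    """Atoms(x): the parts y of x such that no proper part z of x is a proper part of y."""
    proper_parts_of_x = {z for z in parts(x) if z < x}  # '<' is proper inclusion
    return {y for y in parts(x)
            if not any(z < y for z in proper_parts_of_x)}

three_castles = frozenset({"c1", "c2", "c3"})
print(sorted(sorted(a) for a in atoms(three_castles)))
```

On this toy plurality the function returns the three singleton parts, and for an atomic individual it returns that individual itself.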


Starting at the root, we need different types of lexical items to capture the 
differences in potential interpretations each lexical item can receive. My three 
classes of roots have the following sets of denotations: 


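(56) STRONG-ONLY: e.g. castle, graveyard, stadium, restaurant.
     a. Countable noun: $[\![\text{castle}]\!]_{\langle n, \langle e, t\rangle\rangle} := \lambda n.\lambda x.\,\text{castle}(x) \mathbin{\&} |\textsc{Atoms}(x)| = n \mathbin{\&} \forall y \in \textsc{Atoms}(x)[\text{castle}(y)]$

(57) STRONG-WEAK AMBIGUOUS: e.g. store, bank, hospital.
     a. Countable noun: $[\![\text{store}_1]\!]_{\langle n, \langle e, t\rangle\rangle} := \lambda n.\lambda x.\,\text{store}(x) \mathbin{\&} |\textsc{Atoms}(x)| = n \mathbin{\&} \forall y \in \textsc{Atoms}(x)[\text{store}(y)]$
     b. Property: $[\![\text{store}_2]\!]_{\langle e, t\rangle} := \lambda x.\,\text{store}(x)$

(58) BIS: e.g. school, jail, prison, church.
     a. Countable noun: $[\![\text{school}_1]\!]_{\langle n, \langle e, t\rangle\rangle} := \lambda n.\lambda x.\,\text{school}(x) \mathbin{\&} |\textsc{Atoms}(x)| = n \mathbin{\&} \forall y \in \textsc{Atoms}(x)[\text{school}(y)]$
     b. Kind property: $[\![\text{school}_2]\!]_{\langle k, t\rangle} := \lambda k.\,\text{school}(k)$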


Next, I assume that the syntax requires a null categorizing head, n, which has 
the denotation of the polymorphic identity function; alternatively, it could have 
no semantic interpretation, and merely be a syntactically (and potentially phono- 
logically) realized functional element. 

Continuing up the tree, the insertion of Num depends on whether the noun
phrase will be interpreted as plural or singular.¹⁰ I assume three potential options.


¹⁰This is somewhat similar to Sauerland (2003) in that it assumes a binary specification for
number, but unlike his system, my denotation for the plural does not include atoms in its extension.
If the noun phrase is specified for number, a contentful Num (as in 60 and 61)
merges; otherwise, no lexical item¹¹ will be inserted.


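(59) $\emptyset$: No lexical item inserted
(60) $[\![\text{Num}_{[+\text{pl}]}]\!] := \lambda P_{\langle n, \langle e, t\rangle\rangle}.\lambda m.\lambda y.[P(m)(y) \mathbin{\&} m > 1]$
(61) $[\![\text{Num}_{[-\text{pl}]}]\!] := \lambda P_{\langle n, \langle e, t\rangle\rangle}.\lambda m.\lambda y.[P(m)(y) \mathbin{\&} m = 1]$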


The choice of which option is possible is determined by the meaning of the
root. First, if the root is not countable, no Num can be inserted; if it were, there
would be a type clash. If the root is countable, a Num is merged,¹² and it could
either be a plural or a singular.

Finally, a $D_{[\pm\text{REF}]}$ can be inserted, depending on the type. There are two
potential interpretations for the definite article.¹³ The first is roughly Sharvy's (1980)
denotation for the, updated to take a higher type, $\langle n, \langle e, t\rangle\rangle$, to account for the
countability of roots; this denotation confers referentiality. The second is a
kindifying definite article that takes a property and returns its corresponding kind
if that kind is well-established (see Chierchia 1998 for details):


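(62) $[\![D_{[+\text{REF}]}]\!] := \lambda P_{\langle n, \langle e, t\rangle\rangle} : \exists x.\exists n.\forall y[\textsc{max}(P)(n)(y) \leftrightarrow x = y].\ \iota x.\exists n.[\textsc{max}(P)(n)(x)]$

(63) $\textsc{max}(P)(n) := \lambda x.\,P(n)(x) \mathbin{\&} \neg\exists y.[P(n)(y) \mathbin{\&} x < y]$

(64) $[\![D_{[-\text{REF}]}]\!] := \lambda P_{\langle e, t\rangle}.\lambda k.\ {}^{\cap}P = k$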


If we have a STRONG-WEAK AMBIGUOUS lexical item, (64) will be inserted after 
the little n head, but if we have a BIS lexical item, (64) cannot be inserted or else 
there would be a type clash. 
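The type-driven restrictions described above can be checked mechanically. The sketch below is only an illustration under stated assumptions: functional types are encoded as Python pairs, little n is treated as the identity (so nP simply has the type of the root), and the names `fa`, `derive` and `TypeClash` are hypothetical helpers. The types assumed for the two prepositions are $\langle e, \langle v, t\rangle\rangle$ for the ordinary one and $\langle\langle k, t\rangle, \langle v, t\rangle\rangle$ for the incorporating one.

```python
# Basic types are strings; a functional type <a, b> is the pair (a, b).
E, T, N, K, V = "e", "t", "n", "k", "v"
ET, KT, VT = (E, T), (K, T), (V, T)
NET = (N, ET)                      # <n, <e, t>>, the type of countable roots

NUM = (NET, NET)                   # Num, cf. (60)-(61)
D_REF = (NET, E)                   # referential article, cf. (62)
D_KIND = (ET, KT)                  # kindifying article, cf. (64)
TO_1 = (E, VT)                     # ordinary directional preposition (assumed type)
TO_2 = (KT, VT)                    # incorporating preposition (assumed type)

class TypeClash(Exception):
    """Raised when a head cannot take its sister as an argument."""

def fa(fn, arg):
    """Function application over types: <a, b> applied to a yields b."""
    if isinstance(fn, tuple) and fn[0] == arg:
        return fn[1]
    raise TypeClash(f"{fn} cannot apply to {arg}")

def derive(root_type, heads):
    """Merge heads bottom-up; n is the identity, so nP has the root's type."""
    current = root_type
    for head in heads:
        current = fa(head, current)
    return current

print(derive(NET, [NUM, D_REF, TO_1]))   # strong definite
print(derive(ET, [D_KIND, TO_2]))        # weak definite
print(derive(KT, [TO_2]))                # BIS
```

All three derivations end in the event-predicate type, whereas merging Num above a property root, or the kindifying article above a kind-property root, raises `TypeClash`.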

Next, we merge the preposition. I take prepositions that can facilitate weak
readings to be ambiguous between normal (e.g. $\textit{to}_1$)¹⁴ and incorporating variants


¹¹For the moment, nothing relies on whether no Num is merged or whether a vacuous, or
"expletive", version is merged, along the lines of Wood (2012) and Myler (2014), among others.

¹²For this work, one could say that Num is privative and has the value PL and it would not affect
the analysis. In this case, the singular would merely be a Num without any features. In this 
work, I follow Harbour (2007) and others in assuming a binary specification for Num. 

¹³These two denotations for the definite article are not lexically connected under the present
account. For the moment, these are merely homophones. This is not a desirable result, since 
the intuition is that there is something universally shared between a kindifying definite and a 
regular strong definite. In fact, there is no language known by this author that has a kindifying 
determiner that is not homophonous with the definite article. In future, it would be better to 
find an account which unifies the two, either by constructing one out of the other, or by finding 
a single denotation that can yield both interpretations. 
(e.g. $\textit{to}_2$),¹⁵ since the weak interpretation can only occur when the definite is
in certain syntactic configurations (e.g. when it is the complement of to). I also
assume, following Aguilar-Guevara (2014), among others, that weak definites do
not make explicit reference to individual atoms, and take Chierchia's (1998)
type-shifters, DOWN and UP; DOWN takes one from a property to a kind, while UP takes
one from a kind to a property.


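(65) $[\![\textit{to}_1]\!]_{\langle e, \langle v, t\rangle\rangle} := \lambda x.\lambda e.\,\textsc{goal}(e) = x$
(66) $[\![\textit{to}_2]\!]_{\langle\langle k, t\rangle, \langle v, t\rangle\rangle} := \lambda P.\lambda e.\,\exists x.\exists k.[P(k) \mathbin{\&} {}^{\cup}k(x) \mathbin{\&} \textsc{goal}(e) = x]$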


The denotation for $\textit{to}_1$ is the classical one for directional prepositions from
event semantics (see Champollion 2017: 57, for one formulation). The denotation
for $\textit{to}_2$ is more unusual, since it is an incorporating adposition. It takes a kind
property and tells you that there is a kind that satisfies the property and that one
of its instantiations is the GOAL of an event.

The structure of a strong definite such as (5) is given in (67). The main
difference between this singular strong noun phrase and a strong plural one
would be the specification on NumP:


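(67) PP : $\langle v, t\rangle$
       P $\textit{to}_1$ : $\langle e, \langle v, t\rangle\rangle$
       DP : $e$
         D[+REF] the : $\langle\langle n, \langle e, t\rangle\rangle, e\rangle$
         NumP : $\langle n, \langle e, t\rangle\rangle$
           Num : $\langle\langle n, \langle e, t\rangle\rangle, \langle n, \langle e, t\rangle\rangle\rangle$
           nP : $\langle n, \langle e, t\rangle\rangle$
             $n_1$ : $\langle\langle n, \langle e, t\rangle\rangle, \langle n, \langle e, t\rangle\rangle\rangle$
             root castle/store$_1$/school$_1$ : $\langle n, \langle e, t\rangle\rangle$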


We combine the categorizing head with the countable root, which passes up the 
interpretation of the root. Next, we add in the number specification, which re- 
stricts the extension of the noun to singletons. Finally, the type requires that 


¹⁴I use the lower types for simplicity, but, if you prefer a continuations-style denotation, the
preposition could have an additional argument for the main event predicate. This has no
consequences for my account of weakness.


(i) $[\![\textit{to}_1]\!]_{\langle\langle v, t\rangle, \langle e, \langle v, t\rangle\rangle\rangle} := \lambda P_{\langle v, t\rangle}.\lambda x.\lambda e.\,P(e) \mathbin{\&} \textsc{goal}(e) = x$


¹⁵Another potential way to avoid this ambiguity would be to use an explicit incorporating
element that constructs $\textit{to}_2$ from $\textit{to}_1$.
we add the updated Sharvy definite (as in 62), and then the regular directional
preposition (as in 65), resulting in the following derivation:


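(68) $[\![n_1\ \text{castle}]\!]$
     $= \lambda n.\lambda y.\,\text{castle}(y) \mathbin{\&} |\textsc{Atoms}(y)| = n \mathbin{\&} \forall z \in \textsc{Atoms}(y)[\text{castle}(z)]$

(69) $[\![\text{Num}_{[-\text{pl}]}\ n_1\ \text{castle}]\!]$
     $= \lambda m.\lambda y.\,\text{castle}(y) \mathbin{\&} |\textsc{Atoms}(y)| = m \mathbin{\&} \forall z \in \textsc{Atoms}(y)[\text{castle}(z)] \mathbin{\&} m = 1$

(70) $[\![D_{[+\text{REF}]}\ \text{Num}_{[-\text{pl}]}\ n_1\ \text{castle}]\!]$
     $= \iota x.\exists m.[(\text{castle}(x) \mathbin{\&} |\textsc{Atoms}(x)| = m \mathbin{\&} \forall z \in \textsc{Atoms}(x)[\text{castle}(z)] \mathbin{\&} m = 1)$
     $\mathbin{\&} \neg\exists y.[\text{castle}(y) \mathbin{\&} |\textsc{Atoms}(y)| = m \mathbin{\&} \forall x' \in \textsc{Atoms}(y)[\text{castle}(x')] \mathbin{\&} m = 1 \mathbin{\&} x < y]]$

(71) $[\![\textit{to}_1\ D_{[+\text{REF}]}\ \text{Num}_{[-\text{pl}]}\ n_1\ \text{castle}]\!]$
     $= \lambda e.\,\textsc{goal}(e) = \iota x.\exists m.[(\text{castle}(x) \mathbin{\&} |\textsc{Atoms}(x)| = m \mathbin{\&} \forall z \in \textsc{Atoms}(x)[\text{castle}(z)] \mathbin{\&} m = 1)$
     $\mathbin{\&} \neg\exists y.[\text{castle}(y) \mathbin{\&} |\textsc{Atoms}(y)| = m \mathbin{\&} \forall x' \in \textsc{Atoms}(y)[\text{castle}(x')] \mathbin{\&} m = 1 \mathbin{\&} x < y]]$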


The denotation in (71) gives a set of events whose GOAL is a unique castle. Some
number of atoms is in the extension of castle, each of them is also a castle,
and their cardinality is one (i.e. there is only one of them). Additionally, it asserts
that there isn't any other entity (which is a castle that has a number of atoms,
which are also castles, and whose cardinality is one) that has the original castle as
one of its proper subparts. This is indeed the interpretation we get for the strong
definite noun phrase.
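To make the truth conditions of the strong definite more concrete, here is a minimal sketch that evaluates the denotations in (61) to (63) on a toy model. It assumes that individuals are finite sets of atom labels (so that the number of atoms of x is simply the size of x) and that a failed presupposition can be represented by `None`; the function names and the labels `c1`, `c2`, `s1` are hypothetical.

```python
from itertools import combinations

def domain(atom_labels):
    """All non-empty sums of the given atoms, modelled as frozensets."""
    labels = sorted(atom_labels)
    return [frozenset(c)
            for r in range(1, len(labels) + 1)
            for c in combinations(labels, r)]

def countable(noun_atoms):
    """Countable-noun meaning, cf. (56a): x has n atoms, all of which are in the noun's extension."""
    return lambda n: lambda x: len(x) == n and x <= noun_atoms

def num_singular(P):
    """Num[-pl], cf. (61): keeps P but adds the condition m = 1."""
    return lambda m: lambda y: P(m)(y) and m == 1

def maximal(P, n, dom):
    """MAX(P)(n), cf. (63): x satisfies P(n) and no P(n)-entity properly contains x."""
    return lambda x: P(n)(x) and not any(P(n)(y) and x < y for y in dom)

def d_ref(P, dom):
    """D[+REF], cf. (62): the unique x that is maximal for some n; None if iota is undefined."""
    candidates = [x for x in dom
                  if any(maximal(P, n, dom)(x) for n in range(1, len(dom) + 1))]
    return candidates[0] if len(candidates) == 1 else None

def to1(x):
    """Ordinary directional preposition, cf. (65)."""
    return lambda e: e["goal"] == x

one_castle = domain({"c1", "s1"})
the_castle = d_ref(num_singular(countable(frozenset({"c1"}))), one_castle)
print(the_castle, to1(the_castle)({"goal": frozenset({"c1"})}))

two_castles = domain({"c1", "c2"})
print(d_ref(num_singular(countable(frozenset({"c1", "c2"}))), two_castles))
```

With a single castle in the model the definite picks it out; with two castles no unique singular maximal entity exists, so the sketch returns `None`, mirroring the uniqueness requirement of the strong reading.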

Compared to a strong definite, a weak definite, such as in (3), differs in at
least two ways. First, the denotation of the root is different, resulting in the weak
definite article (64) being merged. Second, these two choices conspire to combine
with the incorporating adposition (66). These combinations are required based on the
type of the root.


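(72) PP : $\langle v, t\rangle$
       P $\textit{to}_2$ : $\langle\langle k, t\rangle, \langle v, t\rangle\rangle$
       DP : $\langle k, t\rangle$
         D[-REF] the : $\langle\langle e, t\rangle, \langle k, t\rangle\rangle$
         nP : $\langle e, t\rangle$
           $n_2$ : $\langle\langle e, t\rangle, \langle e, t\rangle\rangle$
           root store$_2$ : $\langle e, t\rangle$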


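(73) $[\![n_2\ \text{store}_2]\!]$
     $= \lambda x.\,\text{store}(x)$

(74) $[\![D_{[-\text{REF}]}\ n_2\ \text{store}_2]\!]$
     $= \lambda k.\ {}^{\cap}\text{store} = k$

(75) $[\![\textit{to}_2\ D_{[-\text{REF}]}\ n_2\ \text{store}_2]\!]$
     $= \lambda e.\,\exists y.\exists k.[{}^{\cap}\text{store} = k \mathbin{\&} {}^{\cup}k(y) \mathbin{\&} \textsc{goal}(e) = y]$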


Finally, we take the BIS, as in (4). Roots that can be bare have the denotation
of a kind-property (see 58b). This root merges with a categorizing head, which
passes up the type and denotation of the root, and then with the incorporating
preposition.

(76) PP : $\langle v, t\rangle$
       P $\textit{to}_2$ : $\langle\langle k, t\rangle, \langle v, t\rangle\rangle$
       nP : $\langle k, t\rangle$
         $n_3$ : $\langle\langle k, t\rangle, \langle k, t\rangle\rangle$
         root school$_2$ : $\langle k, t\rangle$


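(77) $[\![n_3\ \text{school}_2]\!]$
     $= \lambda k.\,\text{school}(k)$
(78) $[\![\textit{to}_2\ n_3\ \text{school}_2]\!]$
     $= \lambda e.\,\exists y.\exists k.[\text{school}(k) \mathbin{\&} {}^{\cup}k(y) \mathbin{\&} \textsc{goal}(e) = y]$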


The derivation for the BIS reflects its similarity to weak definites. More
specifically, both derivations lack a Num projection, and combine with the
incorporating adposition.
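The parallel between the weak definite and the BIS can be illustrated with a small model of kinds. The sketch below is only an approximation under loud assumptions: each kind is represented by a label paired with the set of its instances, DOWN maps a property extension to the kind with exactly those instances, and UP is the lookup of a kind's instances. The dictionary `KINDS` and all labels in it are hypothetical.

```python
# Hypothetical toy model: each kind is paired with the set of its instances.
KINDS = {"STORE": frozenset({"s1", "s2"}), "SCHOOL": frozenset({"h1"})}

def down(extension):
    """DOWN (property to kind): the kind whose instances are exactly `extension`, if any."""
    for kind, instances in KINDS.items():
        if instances == extension:
            return kind
    return None  # no well-established kind corresponds to this property

def d_kind(extension):
    """Kindifying article, cf. (64): true of k iff DOWN(P) = k."""
    return lambda k: down(extension) is not None and down(extension) == k

def to2(kind_property):
    """Incorporating preposition, cf. (66): some instance of a qualifying kind is the GOAL of e."""
    return lambda e: any(kind_property(k) and e["goal"] in instances
                         for k, instances in KINDS.items())

store_2 = frozenset({"s1", "s2"})            # property denotation of the root, cf. (57b)
school_2 = lambda k: k == "SCHOOL"           # kind-property denotation, cf. (58b)

to_the_store = to2(d_kind(store_2))          # weak definite, cf. (75)
to_school = to2(school_2)                    # BIS, cf. (78)

don, ron = {"goal": "s1"}, {"goal": "s2"}
print(to_the_store(don), to_the_store(ron))
print(to_school({"goal": "h1"}), to_school(don))
```

Both of the first two events count as going to the store even though their goals are different stores, which is the non-unique behaviour of the weak reading; neither derivation involves Num.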


6 Conclusion 


I have argued that weak definites and bare singulars mean similar things (both 
are number neutral), and share comparable morphosyntactic structure (both lack 
a Num projection, and merge with an incorporating adposition). Roots that
participate in weak nominal constructions divide into two lexical classes; one
participates in weak definite constructions and the other participates in BIS
constructions. These two classes are distinct: no single lexical item can participate
in both weak definite and BIS constructions. Lexical items from these classes are
semantically type ambiguous at the root level, with two denotations each. This 
semantic ambiguity affects whether the root can appear in particular syntactic 
configurations (e.g. whether it requires an overt strong determiner to be merged). 
Interpretive differences between strong and weak nominals correspond to
differences at two syntactic positions: first, at the root-level, semantic type
ambiguity determines which interpretation(s) is/are possible, and second, at the
determiner-level, the semantic type of the root conditions which of two versions 
of the definite determiner will be chosen. Using these two ingredients, this ac- 
count explains why weak definites and bare singulars receive number neutral 
interpretations, while simultaneously explaining their lexical idiosyncrasies. 
We discuss the properties of WEAK DEFINITE noun phrases, definite noun phrases
(henceforth DPs) which do not uniquely refer to an individual referent. Since one of
the properties of generic noun phrases is that they do not uniquely refer, we asked
whether weak definites might in fact be a form of generic noun phrase. We adopted
a quantitative and experimental approach, conducting a corpus analysis and four
experiments designed to assess whether weak definites differ from generic and
regular definite DPs. A corpus analysis by de Sá et al. (2016) showed that generic
DPs and weak definites are not in complementary distribution. A follow-up analysis
of verb aktionsart showed that most weak definites appear in telic or activity
clauses. The experiments also compared matched sentences with weak, regular and
generic readings of DPs. These studies do not find similarities between weak
definites and generics. We conclude that weak definite noun phrases are not
generics.
1 Introduction


Definite reference has played a central role in linguistics, the philosophy of lan- 
guage and in psycholinguistics (Russell 1905; Strawson 1950; Donnellan 1966; 
Clark & Marshall 1981; Heim 1982; Aguilar-Guevara & Zwarts 2013). Modulo 
some nuanced differences in the treatment of definite reference, there is gen- 
eral agreement that definite noun phrases carry a "familiarity", "uniqueness" or 
“identifiability” condition; the referent of a definite referring expression should 
be uniquely identifiable within a referential domain. In Example (1), the hospital 
denotes only one hospital in the world, being unique, and it is known by the 
interlocutors, being familiar. 


(1) Workers picketed the hospital to protest layoffs. 


However, so-called weak definite¹ noun phrases (Carlson & Sussman 2005)
such as the hospital in (2) violate uniqueness: the speaker does not need to have
any specific hospital in mind when she utters the hospital. Moreover, John and
Bill could even be going to different hospitals.


(2) John went to the hospital and so did Bill.


It is also known that reference in definite noun phrases can be generic. In those
cases, the definite noun has uniqueness of a kind, i.e. it denotes a kind, not an
individual referent. The hospital in (3) is an example, because it does not have a
unique individual referent, but a kind referent: the hospital is a kind of place.


(3) In the XVIII century, hygiene rules were introduced into the hospital in the 
Western world. 


For Aguilar-Guevara & Zwarts (2011: 193) weak and generic definites would be 
“different faces of a same phenomenon”, because both of them would have the 
uniqueness of a kind property, denoting a kind. Indeed, if the lack of individual 
reference in weak definites can be reduced to the fact they are generic definites, it 
would be the most straightforward means of accounting for this lack of individual 
reference. 


¹Poesio (1994) was the first to use the name weak definites, questioning Russellian
uniqueness (Russell 1905) and Heim's (1982) familiarity. He noted that in sentences like John
got these data from the student of a linguist there is no need for familiarity with the
student, or for the student to be characterized as a single individual, in order to understand
the sentence. He named this class of definites weak definites. Carlson & Sussman (2005)
adopted the term weak definites, observing that weak definites lack uniqueness.
The current work does not directly address the specific analysis proposed by
Aguilar-Guevara & Zwarts (2011). Instead, we address the basic question of the
extent to which weak definites share the properties of generic noun phrases and
regular noun phrases.

In this chapter, we employ empirical means to evaluate the hypothesis that
definite generics and weak definites are the same phenomenon. We examine
corpus data from Brazilian Portuguese and experimental data from English to
evaluate this question.

We begin with a brief summary of the properties of weak definites. 


2 Weak definites 


The term weak definite noun phrase is used here to describe a certain kind of 
construction that Carlson and collaborators (Carlson & Sussman 2005; Carlson 
et al. 2006; 2013 and Klein et al. 2013) have been working on for some time under 
this designation. The contrasting class of definite noun phrases is called regular 
definites (sometimes "strong definites"), meaning that they trigger the
familiarity/uniqueness presuppositions commonly focused on in the literature on
definite descriptions. The term weak definite noun phrase(s) is often shortened to
simply weak definite(s), but we wish to be clear that we do not use this term in the
present context to refer to just any noun phrase which, in a language differentiating
"strong" vs. "weak" definite article forms, has the definite article in the "weak"
form. When we wish to refer to the morphological forms of definite articles, we
will do so explicitly. 

Besides failing to trigger uniqueness presuppositions, these noun phrases,
among other properties, must occur in construction with a specific verb or
preposition, may only occur in the singular form or the plural form but not both, and
are not subject to restrictive modification.² They appear to have the semantic
truth-conditions of narrow-scope indefinites, and normally trigger semantically
"enriching" implications - i.e. there is a non-compositional aspect to their
meaning. Finally, the constructions appear to have a more "eventive" meaning than
the corresponding compositional constructions, a matter we try to pin down a
bit more precisely below.³

Our work was motivated in part by the incorporation hypotheses proposed by
Carlson and colleagues. Weak definite noun phrases are treated as an incorporated
structure by Carlson et al. (2013) and Klein et al. (2013), in which the noun
phrase and the verb have the semantics of an incorporated event in which the
article, definite or indefinite, takes scope over the incorporated structure. This
analysis unifies the observation that weak definites need not uniquely refer and
the observation that they evoke habitual events associated with the noun. It also
provides an explanation for the role of the definite article and makes the novel
prediction that the same noun phrases that can have a weak definite interpretation
can also appear in "weak indefinite" structures, which are incorporated
structures that have properties more characteristic of an indefinite than a
definite. Crucially, this approach assumes that weak definites do not have the same
properties as generic DPs.


²See Aguilar-Guevara (2014) for insight into the allowable modifiers.
³The constructions under consideration have a number of characteristics that are summarized
in Carlson et al. (2006).

In an attempt to better understand the role of the definite article in the
determined phrase and in the incorporated construction, we first report a corpus
analysis that examined whether weak definites exhibit properties of generics (§3).
Then, we report the results of four experiments (§4).


3 Corpus analysis 


In order to observe whether weak definites pattern with generic definites, de Sá
et al. (2016) analyzed data from a Brazilian Portuguese (BP) corpus. Four hundred
occurrences of 31 words that may present the weak reading in BP (e.g. the
hospital) were analyzed. They analyzed whether the word was determined by a
definite article, and if so, whether the DP reading was weak (Carlson & Sussman 
2005), strong - or regular — (Russell 1905), or generic (Carlson 2006). They then 
looked at the distribution of those three kinds of definites. As expected, the regu- 
lar reading is significantly more frequent than the others, 45.6%, but surprisingly, 
according to the categorization criteria, the weak DPs occur significantly more 
often than the generic ones, 33.7% versus 27.5%. 

The authors also described the DP's syntactic function - subject, object, ad- 
junct - for occurrences of weak, regular and generic definites in the corpus anal- 
ysis. The goal was to compare the distributional properties of weak definites, 
generic DPs and regular definites. They evaluated two hypotheses: 


1. If weak definites are in fact generics, then generic DPs and weak definites 
should either occur in the same environments or be in complementary 
distribution with one another, indicating that they are variations of the 
same linguistic type. 
2. The second hypothesis was motivated by an analysis that weak definites
undergo semantic incorporation proposed by Carlson et al. (2013). The se- 
mantic incorporation hypothesis predicts that weak definites should occur 
primarily as the object of a verb or a preposition but rarely should occur 
in subject position. 


They found that generics (Figure 1A) are more uniformly distributed between 
subject (25.1%) and object (20.3%), being adjuncts most frequently (54.6%). Reg- 
ular definites showed the same overall pattern (Figure 1B), presenting a signifi- 
cant majority of adjuncts (43.7%), followed by objects (31.3%), and subjects (25%). 
Weak definites presented a different distribution in which they appear as ad- 
juncts (45.7%) as often as objects (46.6%). Weak definites, however, seldom ap- 
pear as subjects. Only 7.2% of the occurrences were as subjects, significantly less 
than the other categories (Figure 1C). 


Figure 1: Definite types and syntactic function (frequency in %, by subject,
object and adjunct) - Generic definites (A), Regular definites (B) and Weak
definites (C) (de Sá et al. 2016: 114, 115)


The authors argued that the weak definites' high occurrence in adjunct and 
in object position could be interpreted as a reflex of an incorporation process, 
as proposed by Carlson et al. (2013) and Klein et al. (2013). But the fact that this 
kind of definite could also be found in subject position is a problem for the incor- 
poration analysis. The data also did not point to a complementary distribution 
between weak and generic definites, which could have been argued to provide
support for the claim that they are the same phenomenon.


3.1 Aktionsarten analysis 


As a follow-up to the syntactic function analysis by de Sá et al. (2016), we
used the same tagged corpus and took the verb as the basis for analyzing the
semantics of the clause in which the definite noun occurred. We focused on the
aktionsart⁴ of the verb, motivated by the incorporation analysis, which
claims that weak definites are incorporated in event or activity verbs (Carlson
et al. 2013).

In terms of aktionsart, the semantic incorporation hypothesis predicts that
weak definites (but not generic DPs) should primarily occur with activity and
telic verbs, but not with state verbs. We also compared weak definites with
generics, which are usually found in clauses with state verbs (Carlson 2006), to
see whether there is a complementary distribution between those categories.

For the same 2196 occurrences (of 31 words which could have generic, weak
and regular readings)⁵ from de Sá et al. (2016), we analyzed the lexical aspect of
the verb for the clauses containing the definite expression.

The verbs were classified as state, activity, or telic (achievement and
accomplishment), based on Vendler (1957). We classified as state verbs those that do
not denote an action, for example the verb ter in BP, as in Example (4):⁶ tem
does not describe a process that unfolds over time, it does not denote an action,
and, in terms of thematic roles, its subject, the school, is not an agent.


(4) Brazilian Portuguese
    Além   do     atendimento pedagógico,  a   escola tem responsabilidades sociais.
    Beyond of+the service     pedagogical  the school has responsibilities  social
    'The school has social responsibilities, which go beyond the pedagogical service.'


⁴We analyzed Vendler's (1957) aktionsart categories: state, activity and telic (achievement
and accomplishment).

⁵Extracted from the ptTenTen corpus, on the Sketch Engine platform. See de Sá et al. (2016)
for more information.

⁶From here until the end of this section, all the examples are from our data.
Activity verbs denote actions which do not need a conclusion point, such as the
verb nadar in Example (5): nadavam is an action that unfolds over time, but it
does not have a finishing point.


(5) Brazilian Portuguese
    Os  alunos   nadavam todo  dia na     escola.
    The students swam    every day in+the school
    'The students swam every day in the school.'


We classified as telic the action verbs that needed a finishing point, as quebrar, 
in Example (6): quebraram is an action that requires a conclusion point. 


(6) Brazilian Portuguese
    Os  vândalos quebraram a   escola durante a   festa.
    The vandals  broke     the school during  the party
    'Vandals broke the school during the party.'


In addition to the notion of aktionsart proposed by Vendler (1957), we used 
the aspectual tests in Dowty (1979) to distinguish one category from another in 
our analysis. As the Dowty tests are proposed for English, we used a version 
proposed by Wachowicz & Foltran (2006) for Brazilian Portuguese. 


3.1.1 Results 


The results are summarized in Table 1 and Figure 2. 


Table 1: Weak and generic definites and aktionsarten corpus occurrence (%)

Conditions   Aktionsarten   Corpus occurrence (%)
Generic      State          48.9
             Activity       37
             Telic          14.1
Weak         State          16.6
             Activity       55
             Telic          28.4
Figure 2: Aktionsarten occurrence percentage in Generic, Weak and
Strong conditions


Weak definites showed a significant difference (χ² = 171.6676, df = 2, p < 0.001)
among state, 16.6%, activity, 55%, and telics, 28.4%, with activity being the most
frequent category. Generic definites also significantly differ (χ² = 85.2335, df = 2,
p < 0.001) in occurrences of state, 48.9%, telic, 14.1%, and activity, 37%.
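For readers less familiar with this kind of test, the sketch below shows how a chi-square statistic with df = 2 can be computed for three aktionsart categories. It assumes a goodness-of-fit test against equal expected counts, which the text does not state explicitly, and the counts are purely hypothetical placeholders rather than the corpus data, so the result does not reproduce the values reported above.

```python
def chi_square_gof(observed):
    """Pearson chi-square goodness-of-fit statistic against a uniform expectation."""
    total = sum(observed)
    expected = total / len(observed)          # equal expected count per category (assumption)
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    degrees_of_freedom = len(observed) - 1
    return statistic, degrees_of_freedom

# Hypothetical counts, for illustration only (not taken from the corpus).
hypothetical_counts = {"state": 20, "activity": 60, "telic": 40}

statistic, df = chi_square_gof(list(hypothetical_counts.values()))
CRITICAL_VALUE_DF2_P001 = 13.816              # chi-square critical value for df = 2, p = 0.001
print(statistic, df, statistic > CRITICAL_VALUE_DF2_P001)
```

A statistic above the critical value indicates that the three categories are not equally frequent at the 0.001 level.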

The aktionsarten analysis is consistent with the incorporation hypothesis, in
that weak definites are more frequent with activity and telic verbs. Also, as
expected, generics are more frequent with state verbs. One interesting finding is
that weak and generic definites are not in complementary distribution.


3.2 Corpus summary 


The quantitative data presented in this corpus analysis provide some interesting
evidence about weak definites. Weak definites are more frequent than generic
definites. Weak definites do occur in subject position, but they do so less
frequently than in object or adjunct position. Another interesting fact about
syntactic position is that there is no complementary distribution between weak and
generic definites, which would have provided support for the generic hypothesis.

The analysis of lexical aspect again found no complementary distribution be- 
tween weak and generic definites. Also, as expected by the incorporation hypoth- 
esis, the majority of weak definites occur in activity and telic clauses. 
4 Experiments


We conducted four experiments in which we compared participants' production
and comprehension for stimuli that were chosen to bias weak, regular and
generic readings. Our goal was to examine whether weak definites and generics
exhibited similar properties, as would be predicted by the simple version of the
generic hypothesis. All of the experiments used the same materials, described
in §4.1. The experiments were conducted in American English, were programmed
in JavaScript, and were run on Amazon Mechanical Turk⁷ using the psiTurk⁸
software. We used the Mechanical Turk platform because it provides easy and fast
access to participants, data collection is reliable, and results are similar to those
obtained in laboratory-based experiments (cf. Mason & Suri 2012; Paolacci et al.
2010).


4.1 Materials 


The experimental materials were 54 sentences divided in three groups containing 
a noun phrase with a definite article which had: a clear generic reading (Exam- 
ple 7), a clear regular reading (Example 8) and a weak reading (Example 9): 


(7) Henry Ford created the bus in his early years. 
(8) James crashed the bus during the night.


(9) Linda took the bus to go to college. 


For all sentences, the target noun was presented in a definite noun phrase
which was the object of a telic verb or an activity verb. In our examples, bus is the
target word; it is preceded by the definite determiner the, forming the bus, which
is in object position of a telic verb (created, crashed, took).

In Example (7), the target DP has a prototypical generic reading, in which
the bus has kind uniqueness (cf. Carlson & Pelletier 1995; Carlson
2006). In Example (8), the bus has a unique referent in the sense of Russell
(1905). In Example (9), the DP supports a weak definite reading. The weak definite
sentences were modeled on examples from Carlson & Sussman (2005), Carlson
et al. (2006; 2013) and Klein et al. (2013).


⁷Available at: https://www.mturk.com/mturk/welcome
⁸Available at: https://psiturk.org/
The 54 sentences were divided into 3 lists of 18 sentences, each list with six
exemplars of each type: regular, generic and weak. The same noun was never 
repeated within a list. The same noun appeared in a different condition in each 
list. Each participant was presented with one of the lists. 
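The list design just described corresponds to a Latin-square rotation of conditions over nouns. The following sketch shows one way such lists could be built; it assumes 18 target nouns (54 sentences divided by three conditions), and the placeholder noun names and the function `build_lists` are hypothetical rather than the authors' actual procedure.

```python
from collections import Counter

CONDITIONS = ["generic", "regular", "weak"]

def build_lists(nouns):
    """Rotate conditions across lists so that each noun appears once per list,
    in a different condition in each list (Latin square)."""
    lists = [[] for _ in CONDITIONS]
    for i, noun in enumerate(nouns):
        for list_index in range(len(CONDITIONS)):
            condition = CONDITIONS[(i + list_index) % len(CONDITIONS)]
            lists[list_index].append((noun, condition))
    return lists

# Placeholder noun names, for illustration only.
nouns = [f"noun{i:02d}" for i in range(1, 19)]
lists = build_lists(nouns)

for number, items in enumerate(lists, start=1):
    print(number, len(items), dict(Counter(condition for _, condition in items)))
```

Each of the three resulting lists contains 18 items, six per condition, and no noun is repeated within a list.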

We briefly describe each of the four experiments in the following subsections. 


4.2 Experiment 1: Judgment 


The first experiment used a judgment task in which the participants judged 
whether the DP referred to either an individual or a category. We reasoned that 
regular definite noun phrases would be rated as referring to individuals whereas 
generics would be rated as referring to categories. Finding this pattern would 
provide important evidence that we had successfully created a set of materials 
with regular reference and a set with generic reference. The critical question was 
whether weak definites would pattern with the generics, as suggested by the 
generic hypothesis, or with regular definites. Participants read one sentence on
each trial and judged whether the bold word (the target word in one of the readings)
was either a CATEGORY or an INDIVIDUAL, using a continuous scale ranging
from 0 to 100 with the words INDIVIDUAL and CATEGORY as the endpoints.
Whether the first endpoint was individual or category was balanced within lists,
as shown in Figure 3.


Figure 3: Judgment task screen - Sentence with the word bus to be
evaluated on a continuous scale from individual to category (screenshot;
example sentence: Linda took the bus to go to college.)


We expected that the noun with a regular reading would be judged as an
individual, while the generic would be judged as a category. This pattern of
results is necessary to validate the task. The generic hypothesis predicts that
the weak definites should pattern with the generic definites, as shown in Table 2.


Table 2: Judgment task - Hypothesis according to generic theory

Definite readings    Weak = Generic

Generic              Category judgment (uniqueness of a kind)
Regular              Individual judgment (uniqueness)
Weak                 Category judgment (uniqueness of a kind)

4.2.1 Participants


90 workers (40 women) from MTurk (https://www.mturk.com/) participated for a
payment of US$0.30. All participants provided informed consent in this
experiment and in all of the other experiments we report.


4.2.2 Results 


We analyzed the data using a linear mixed model fit by REML [‘lmerMod’]. Using
0 as the individual endpoint and 100 as the category endpoint, regular definites
were rated as closest to the individual endpoint (mean = 19.82), whereas generics
were rated as closest to the category endpoint (mean = 80.63). Weak definites
were rated as closer to the individual endpoint (mean = 34.56). However, they fell
between the regular definites and the generics (Figure 4). Importantly, weak
definites differed significantly from both the regular and generic noun phrases
(Table 3).


[Plot: mean judgment (y-axis: Means) for the generic, regular and weak
conditions (x-axis: Conditions)]


Figure 4: Judgment task - Judgment means (individual to category) by 
condition 

Table 3: Judgment task - Statistics - Linear mixed model fit by REML
[‘lmerMod’]

Formula: ScaledResponse ~ condition + (1 + condition | subject) + (1 | item)
Data: data
Control: lmerControl(optimizer = "bobyqa")

                    Estimate   Std. Error   t-value
(Intercept)           80.561        2.756     29.24
Regular condition    -60.719        4.501    -13.49
Weak condition       -46.133        4.134    -11.16


The results provide clear evidence that we successfully created two sets of 
sentences using the same nouns, that when used with a definite article in a DP, 
had a regular reading for one set and a generic reading for the second set. This 
serves as important validation for the materials. We also tested the prediction 
that if weak definites are, in fact, generics then they would show the same pat- 
tern. However, the sentences with weak definite noun phrases did not pattern 
with generic noun phrases and they were more similar to regular definite noun 
phrases than they were to generics. We note, however, that because judgments of
weak definites fell between the regular definites and the generics, one could
argue that the results do not rule out the view that weak definites are
generics. One characteristic of noun phrases
that have weak definite readings is that they can also be interpreted as regular 
definites. Therefore the results for the weak definites could, in principle, reflect 
a mix of regular and generic interpretations. 

One way to assess the mixture possibility is to examine the distribution of
responses to the three types of stimuli. If weak definites were a mix of regular
definites and generics, we might expect to see a bimodal distribution, with an
increased number of responses near the category endpoint. Figure 5 shows the
distributions. Inspection of the patterns does not seem to support the mixture
hypothesis. Nonetheless, this remains a possibility for results in which weak
definites are intermediate between regulars and generics.

Figure 5: Judgment task - Condition histograms: (A) Generic distribution,
(B) Regular distribution, (C) Weak distribution


4.3 Experiment 2: Forced choice 


Our second experiment used a forced choice task, in which participants were
presented with the same sentences as those used in the previous experiment.
Participants were asked to choose between two possible noun phrases for a
continuation sentence. One was a noun phrase that was anaphoric with the definite
noun phrase in the preceding sentence (e.g. That telephone...). The other was a
noun phrase that would introduce a new referent (e.g. A telephone...) (Figure 6).


Frank answers the telephone promptly. 


A telephone... That telephone... 


Figure 6: Forced choice task screen 


Our rationale was that regular definites would most likely be interpreted as
referring to an individual, therefore licensing an anaphoric reference. In
contrast, the kind reference supported by a generic would be more consistent with
a continuation that introduced a novel referent. If weak definites are indeed a
kind of generic, we would expect subjects to choose a new referent more often than
the anaphoric continuation, i.e. weak definites would behave more like generics.

4.3.1 Participants


We again tested 90 workers (34 women) from MTurk for a payment of US$0.30, 
using the same lists as those created for Experiment 1. 


4.3.2 Results 


Figure 7 and Table 4 show the results. As we can observe, in sentences with the
generic definite participants preferred a new referent (76.7%), while the regular
reading showed the opposite preference, with 23.4% new referents. The weak
definite did not pattern with the generic: participants chose a new referent only
42.9% of the time (Table 5).
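
To relate the log-odds estimates reported in Table 5 to the proportions in Table 4, the fixed effects can be passed through the inverse logit function. The sketch below is illustrative only. It assumes that the generic condition is the reference level of the model (as the rows of Table 5 suggest), and the names `FIXED_EFFECTS` and `predicted_p_new` are hypothetical.

```python
import math

# Fixed-effect estimates (log-odds) copied from Table 5.
# Assumption: "generic" is the reference level, so it has no coefficient.
FIXED_EFFECTS = {"(Intercept)": 1.4719, "regular": -3.3248, "weak": -1.9284}


def inv_logit(log_odds):
    """Map log-odds onto a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def predicted_p_new(condition):
    """Model-based probability of choosing the NEW continuation."""
    log_odds = FIXED_EFFECTS["(Intercept)"]
    if condition != "generic":
        log_odds += FIXED_EFFECTS[condition]
    return inv_logit(log_odds)


for condition in ("generic", "regular", "weak"):
    print(condition, round(predicted_p_new(condition), 3))
```

The resulting values (roughly 0.81, 0.14 and 0.39) need not equal the raw proportions in Table 4, since the fixed effects of a mixed logit model describe a subject and item whose random effects are zero rather than the average over all responses. The ordering of the three conditions is nevertheless the same.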

Results confirmed the expected pattern both for the clearly generic and regular 
expressions. Although the weak definites did not pattern with the generics, they 
showed fewer anaphoric choices than regular definites. This is not surprising 
because on the one hand, weak definites do not require a uniquely identifiable 
referent but, on the other hand, a weak definite noun phrase can easily be shifted 
to an interpretation with a uniquely identifiable referent. 

Again, however, one could argue that the results for weak definites could
reflect a mix of generic and regular definites. In order to provide more nuanced
evidence that did not require a meta-linguistic judgment with a binary choice,
we conducted two production experiments.


4.4 Experiment 3: Free completion 


In this experiment participants generated continuations for the sentences used 
in the previous experiments. No specific constraints were put on the form of the 
continuations except that participants should not use language that would upset 
their grandparents, as in Figure 8. 


The great German composer, Wagner, changed the opera 
for good. 


| Submit 


Figure 8: Free completion task screen 


We analyzed the continuations to see if they repeated the definite expression.
The logic of the analysis was based on the incorporation hypothesis of Carlson
et al. (2013) and Klein et al. (2013). If weak definites are indeed part of
incorporated structures, then the event would be more salient than an individual
referent introduced by a regular definite noun phrase or a kind reference
introduced by a generic.

Figure 7: Forced choice task - Proportion of NEW by condition


Table 4: Forced choice task - Proportion of NEW and OLD by condition

Conditions   New (A X...)   Old (That X...)

Generic      0.767          0.233
Regular      0.234          0.767
Weak         0.429          0.571


Table 5: Forced choice task - Generalized linear mixed model fit by
maximum likelihood (Laplace approximation) [‘glmerMod’]

Family: binomial (logit)
Formula: choice == "New" ~ condition + (1 + condition | subject) + (1 | item)
Control: glmerControl(optimizer = "bobyqa")

                    Estimate   Std. Error   z-value   Pr(>|z|)
(Intercept)           1.4719       0.2118     6.949   3.67 × 10^-12 ***
Regular condition    -3.3248       0.3485    -9.540   < 2 × 10^-16 ***
Weak condition       -1.9284       0.3130    -6.162   7.18 × 10^-10 ***

4.4.1 Participants

90 workers (55 men) from MTurk participated for a payment of US$3.00.


4.4.2 Results 


The frequency of repetition of the target word (e.g. opera) by condition was
evaluated. The continuation in (10) is an example⁹ of a situation in which there
was no target word repetition; the experimental sentence had the target word
opera, which was not used in the completion.


(10) Experimental sentence: The great German composer, Wagner, changed the 
opera for good. 


Completion: He was a beautiful person. 


We counted as repetitions those occurrences in which the target word was
repeated in pronoun form, as a DP (any kind of determiner + target word) or
as a bare noun (only the target word, in either plural or singular form). In
Example (11), the repetition by a pronoun form (i.e. it) can be observed. In
Example (12), the DP repetition occurred (i.e. the opera). The last example, (13),
shows bare noun repetition (i.e. operas).


(11) Experimental sentence: The great German composer Wagner changed the 
opera for good. 
Completion: It is now much better then before. 


(12) Experimental sentence: The great German composer Wagner changed the 
opera for good. 
Completion: The opera is still a noble entertainment today. 


(13) Experimental Sentence: The great German composer Wagner changed the 
opera for good. 


Completion: Many later operas incorporated his changes. 


Table 6 and Figure 9 show that our hypothesis was confirmed: the weak definite
was repeated significantly less often than the other definite conditions (see
Table 7 for statistics).


9 All the following examples are from the data.

Figure 9: Free completion task - Proportion of target word repetition
by condition


Table 6: Free completion task - Proportion of target word repetition 
(YES) and no-repetition (NO) by condition 


Condition NO YES 
Generic 0.576 0.424 
Regular 0.750 0.250 
Weak 0.881 0.119 


Table 7: Free completion task - Generalized linear mixed model fit by
maximum likelihood (Laplace approximation) [‘glmerMod’]

Family: binomial (logit)
Formula: twr == "y" ~ condition + (1 | subject) + (1 | item)
Data: datac
Control: glmerControl(optimizer = "bobyqa")

                    Estimate   Std. Error   z-value   Pr(>|z|)
(Intercept)          -2.3662       0.2654    -8.915   < 2 × 10^-16 ***
Weak condition        1.9978       0.3454     5.785   7.27 × 10^-9 ***
Regular condition     1.0323       0.3487     2.960   0.00308 **

The results showed that the definite noun was more likely to be repeated in a
continuation for the generic and regular sentences than for sentences with
weak interpretations. Unlike in the previous experiments, where the weak
definites fell somewhere between regular definites and generics, here the regular
definites and generics were similar to one another, with the weak definites
showing the fewest repetitions.

Moreover, when participants chose continuations with repetitions they tended
to use different morphosyntactic forms and they made different semantic choices.
In Examples (14-16) below, the target word in the experimental sentence is in
the generic condition, in which opera is a kind. When the subjects repeated
opera, they used three different morphosyntactic forms, but they kept the kind
reading.


(14) Experimental sentence: The great German composer Wagner changed the 
opera for good. 


Completion: It is now much better then before. 


(15) Experimental sentence: The great German composer Wagner changed the 
opera for good. 


Completion: The opera is still a noble entertainment today. 


(16) Experimental sentence: The great German composer Wagner changed the 
opera for good. 
Completion: Many later operas incorporated his changes. 


The morphosyntactic choices for generics were interesting, especially the use
of bare noun forms, which have a generic reading. The final experiment used a
forced completion task to investigate the forms that repetition would take.


4.5 Experiment 4: Forced completion 


Another group of participants was asked to generate completions. In contrast to 
Experiment 3, participants were instructed to repeat the bolded noun used in the 
first sentence. However, they were not given any instructions about the form of 
the repetition. 

The determiner choice (bare, definite, pronoun) was analyzed. We expected
that, if the first sentence contained a generic definite expression, participants
would be more likely to use the noun in a bare plural expression than after a
regular definite. Taken as a whole, the pattern of results from the previous
experiments suggested that weak definites would pattern with regular definites,
with minimal use of bare nouns.


In the XVIII century, hygiene rules were introduced 
into the hospital in the western world. 
Submit 


Figure 10: Forced completion task screen 


4.5.1 Participants 


30 workers (16 men) from MTurk participated for the payment of US$3.00. 


4.5.2 Results 


In all conditions the definite article + noun (“dp” in Figure 11) was the most used
form of repetition, as expected, both because the definite expression was used
in the first sentence and because it is by far the most frequent kind of nominal
phrase. However, bare plurals were sometimes used, but only in the generic
condition (“bp” in Figure 11). In fact, they were the second most preferred
repetition form for the continuations following generic sentences. Crucially,
bare plurals were never used in continuations that followed weak definites.

Figure 11: Forced completion - Types of repetition by condition

Below are some completions that exemplify the different morphological forms of
repetition found in our data. Example (17) is a “dp” (the + noun) occurrence;
Example (18) a “bp” (bare plural noun); Example (19) an “ad” (noun transformed
into an adjective); Example (20) a “verb” (noun transformed into a verb).


(17) Experimental sentence: In the XVIII century hygiene rules were introduced 
into the hospital in the Western world. 
Completion: The hospital was now a clean place. 


(18) Experimental sentence: In the XVIII century hygiene rules were introduced 
into the hospital in the Western world. 
Completion: Hospitals had never understood the importance of cleanliness. 


(19) Experimental sentence: In the XVIII century hygiene rules were introduced 
into the hospital in the Western world. 
Completion: The hospital industry is now one of the largest in the world. 


(20) Experimental sentence: In Medieval times merchants used the bank to 
deposit their credit. 
Completion: Merchants did a lot of banking and made money. 


Also found in our data were the “bs” (bare singular noun), as in Example (21);
the pronoun (it), Example (22); the “ip” (noun determined by an indefinite
article), Example (23); the “pdp” (noun determined by a pronoun), Example (24);
and the “qdp” (noun determined by a quantifier), Example (25).


(21) Experimental sentence: Most song writers use the guitar when writing
songs. 
Completion: Guitar is the perfect instrument to work out music. 


(22) Experimental sentence: Samuel sold the guitar last year. 


Completion: He didn't want to sell it because it was his favorite guitar but 
he needed the money. 


10 Samuel vendeu a guitarra no ano passado. 

(23) Experimental sentence: Jimi Hendrix played the guitar better than anyone
else. 
Completion: Nowadays a guitar that was played by him is worth very 
much. 


(24) Experimental sentence: Zack listens to the radio while he drives. 
Completion: His car radio is an aftermarket system. 


(25) Experimental sentence: In the XVIII century hygiene rules were introduced 
into the hospital in the Western world. 
Completion: Every hospital since then uses the same rules. 


The morphosyntactic form of the repetition was another interesting finding that
distinguishes weak and generic definites. Bare plural nouns occurred only in the
generic condition, which once again behaved differently from weak definites.


4.6 Summary of experimental findings 


In sum, we created a set of materials with which we could compare the properties
of weak, regular and generic sentences with object DPs. Experiment 1 established
that the regular and generic sentences showed the expected properties, with
regulars being judged as being about an individual and the generics as about a
category. The weak definites behaved more similarly to the regular definites than
to the generics. In Experiment 2 we found that, as expected, regular definites
licensed anaphoric completions, whereas generics encouraged interpretations that
introduced new referents. Again, weak definites behaved more similarly to regulars
than to generics. Experiment 3 found similar results in a free completion
task. Finally, Experiment 4 required participants to repeat the noun phrase in
their completions; the distribution of the completions suggested that generics
behaved differently from both regular and weak definites.


5 Conclusions 


In this chapter we presented new data from a corpus analysis and a set of ex- 
perimental studies that examined properties of weak definites, regular definites 
and generics. The goal of this work was to provide additional evidence that could
be used to evaluate the hypothesis that weak definite noun phrases are in fact
generic DPs.

In a corpus analysis we found that weak definites and generics are not in
complementary distribution in either the syntactic environments in which they
appear or the semantic types of events as indexed by the verb. Moreover, as
predicted by the incorporation analysis, the majority of weak definites occurred
in activity and telic clauses, while generic definites occurred more frequently in
state and activity clauses. In a set of experiments we first created and validated
properties of regular, generic and weak definites. We found that, for the most
part, weak definites behaved more like regular definites than generics. We also
evaluated the possibility that the behavior of weak definites reflected a mix of
trials in which the weak definite was given a regular definite interpretation and
trials in which it was given a generic interpretation. This type of model was,
however, inconsistent with the results of several of the experiments. In sum,
then, we found little evidence to support the hypothesis that weak definites show
similar properties to generics.

Our results are consistent with the incorporation hypothesis in that it assumes 
that the non-uniqueness of reference in weak definites does not arise because it 
is a form of generic. Therefore it would have been problematic for the incorpo- 
ration hypothesis if weak definites had, in fact, patterned with generics in our 
studies. Further research will be necessary to determine whether the absence of 
generic-like behavior in these studies would be consistent with the type of anal- 
ysis argued for in Aguilar-Guevara (2014), which accounts for non-uniqueness 
by assuming that weak definites derive their non-uniqueness of individual refer- 
ence by virtue of their generic status and their eventive properties by virtue of 
the KLR rules, described in detail in Aguilar-Guevara (2014). Addressing these 
issues is beyond the scope of the current chapter. 

Although the results we presented and the linguistic phenomena that we dis- 
cussed lead us to conclude that the semantic incorporation hypothesis provides 
an account of the behavior of weak definites without assuming that they are 
generics, it is important to conclude with some caveats. First, in the corpus
analysis weak definites frequently appeared in subject position, which is
unexpected in the incorporation analysis. Second, the conclusions from our
experiments bring evidence to bear on the two analyses only insofar as we have
been able to tap into the relevant referential behavior with our tasks. Third,
there are properties of weak definites, in particular the parallel restrictions on
modifiers for weak definites and generics, that receive a straightforward account
on the generic analysis developed by Aguilar-Guevara (2014), but require
additional work to be explained by the incorporation analysis. Fourth, the
arguments for the role of the definite article depend on the scoping analysis we
presented, which has some precedents in the literature but is not addressed in
these empirical studies. If this analysis proves problematic, it will be important
to explore other alternatives. Finally, we want to emphasize a point that has
emerged from the work that the authors have conducted in collaboration with each
other and with other colleagues. For a phenomenon such as weak definites, which
involves subtle interactions between putative structures and conceptual
representations, and for which the linguistic data are less than definitive,
experimental studies that target particular hypotheses can prove to be an
important complement to linguistic argumentation.

Abbreviations

BP     Brazilian Portuguese
DP     Definite Phrase
ad     noun transformed into an adjective
bp     bare plural noun
bs     bare singular noun
dp     definite article (the) + noun
ip     indefinite article (a/an) + noun
it     pronoun it
pdp    noun determined by a pronoun
qp     noun determined by a quantifier
verb   noun transformed into a verb

Most vs. the most in languages where the more means most

University of Gothenburg


This paper focuses on languages in which a superlative interpretation is typically 
indicated merely by a combination of a definiteness marker with a comparative 
marker, including French, Spanish, Italian, Romanian, and Greek (DEF+CMP LAN- 
GUAGES). Despite ostensibly using definiteness markers to form the superlative, su- 
perlatives are not always definite-marked in these languages, and the distribution 
of definiteness-marking varies across languages. Constituency structure appears to 
vary across languages as well. To account for these patterns of variation, we iden- 
tify conflicting pressures that all of the languages in consideration may be subject 
to, and suggest that different languages prioritize differently in the resolution of 
these conflicts. What these languages have in common, we suggest, is a mecha- 
nism of Definite Null Instantiation for the degree-type standard argument of the 
comparative. Among the parameters along which languages are proposed to differ 
is the relative importance of marking uniqueness vs. avoiding determiners with 
predicates of entities that are not individuals. 


1 Introduction 


In French, placing a definite article before a comparative adjective, as in (1), suf- 
fices to produce a superlative interpretation: 


more means most. In Ana Aguilar-Guevara, Julia Pozas Loyo & Violeta Vázquez-Rojas 

(1) Elle est la  plus grande. (French)
    she  is  the CMP  tall

    ‘She is the tallest.’


French is not alone; other Romance languages, as well as Modern Greek, Mal- 
tese and others, make do with the same limited resources. Some examples are 
given in Table 1.¹ This paper considers such languages, which we call DEF+CMP
LANGUAGES, against the background of a growing literature on cross-linguistic 
variation with respect to the relationship between definiteness-marking and the 
interpretation of superlatives. 


Table 1: Comparative and superlative degree of ‘tall’ in selected
DEF+CMP languages

LANGUAGE        POS      CMP           SPRL

English         tall     taller        tallest
French          grande   plus grande   la plus grande
Spanish         alto     más alto      el más alto
Romanian        înalt    mai înalt     cea mai înalt
Italian         alto     più alto      il più alto
Greek           psilós   pio psilós    o pio psilós
Greek (alt 2)   psilós   psilóteros    o psilóteros


When it comes to the superlatives of ordinary gradable adjectives like tall, 
the interpretive contrast of interest is the distinction between so-called absolute 
and relative readings of superlatives in the domain of quality superlatives. In 
Swedish, unlike English, this interpretive distinction is signalled morphologically 
with definiteness: 


1 Besides Romance languages, languages reported to use this strategy include Modern
Standard Arabic, Assyrian Neo-Aramaic, Middle Armenian, Modern Greek, Biblical
Hebrew, Livonian, Maltese, Chalcatongo Mixtec, Papiamentu, Vlach Romani, Russian,
and Tamashek (Bobaljik 2012; Gorshenin 2012). Note, however, that Gorshenin has
rather liberal criteria for a given construction being of this type; for Russian,
the example given is Etot žurnal sam-yj interesn-yj ‘This magazine is the most
interesting (one)’. Gorshenin (2012: 129) describes sam-yj as an "emphatic
pronoun" and reasons that "this pronoun indicates uniqueness, particularity of
the referent in some respect, and therefore it can be regarded as a functional
equivalent of a determiner in the corresponding superlative construction".

(2) a. Gloria sålde god-ast glass. (Swedish)
       Gloria sold delicious-SPRL ice cream

       ‘Gloria sold the most delicious ice cream.’ (relative only)

    b. Gloria sålde den god-ast-e glass-en.
       Gloria sold the delicious-SPRL-WK ice cream-DEF

       ‘Gloria sold the most delicious ice cream.’ (relative or absolute)


As Teleman et al. (1999) discuss, (2a) means that Gloria sold more delicious ice 
cream than anyone else. It would not suffice for (2a) to be true for there to be a 
salient set of ice creams of which Gloria sold the most delicious. If someone else 
sold that ice cream as well, then (2a) would be false. In contrast, the English gloss 
and the definite-marked example (2b) could be true if both Gloria and someone 
else sold the ice cream that was more delicious than all other ice creams that 
are salient in the context. All that is required for that sentence to be true is that 
Gloria stands in the ‘sold’ relation to the ice cream satisfying that description. 

In Heim's (1999) terms, (2a) has a relative reading (originally called a compar- 
ative reading by Szabolcsi 1986), and (2b), along with the English gloss, is am- 
biguous between a relative reading and an absolute reading. Relative readings 
are typically focus-sensitive, implying a comparison between the focus (e.g. Glo- 
ria) and the focus-alternatives, and on such readings the superlative noun phrase 
behaves like an indefinite despite the frequent presence of a definite determiner 
(Szabolcsi 1986; Coppock & Beaver 2014). On an absolute reading, comparisons 
are made only among elements satisfying the descriptive content of the modified 
noun, and the definite behaves as a definite. The contrast between absolute and 
relative readings was discussed early on by Szabolcsi (1986) with reference to 
Hungarian, and has been taken up in a fair amount of recent cross-linguistic re- 
search, mainly focused on English (Gawron 1995; Heim 1999; Hackl 2000; Sharvit 
& Stateva 2002; Hackl 2009; Teodorescu 2009; Krasikova 2012; Szabolcsi 2012; 
Bumford 2016; Wilson 2016), but also with reference to German (Hackl 2009), 
Swedish (Coppock & Josefson 2015), other Germanic languages (Coppock 2019), 
Hungarian (Farkas & Kiss 2000), Romanian (Teodorescu 2007), Spanish (Rohena- 
Madrazo 2007), Arabic (Hallman 2016), and Slavic languages including Macedo- 
nian, Czech, Serbian/Croatian and Slovenian (Pancheva & Tomaszewicz 2012). 
This paper extends this line of research insofar as it considers the
morphosyntactic realization of both types of readings in DEF+CMP languages.

The landscape of possible interpretations is slightly different when it comes
to the superlatives of quantity words, like English much, many, little and few.
In English, the most has a relative reading (‘more than everybody else’), while
bare most has what is called a proportional reading (‘more than half’, roughly).
In this domain, there is an especially great deal of cross-linguistic variability.
As Hackl (2009) shows, German die meisten, lit. ‘the most’, can be translated into
English either as most or the most. Even more dramatically, English and Swedish
are near-opposites with respect to the impact of definiteness-marking on
interpretation (Coppock & Josefson 2015); the definite quantity superlative de
flesta has a proportional reading, corresponding to English most, while the bare
flest has a relative reading, corresponding to English the most. Coppock (2019)
shows that every possible correlation between definiteness and interpretation
is attested among the Germanic languages. So the quantity domain is one that
appears to be particularly volatile.

We might expect the landscape of variation with respect to the
definiteness-marking of superlatives to be rather dull and flat within the realm
of DEF+CMP languages. If superlatives are formed with definiteness-markers, then
definiteness-markers should always appear, regardless of what reading is
involved. But this is not what we find.

We find in fact several departures from the dull and flat picture one might ex- 
pect. First, as Dobrovie-Sorin & Giurgea (2015) discuss, French is one of the many 
languages of the world where quantity superlatives do not have a proportional 
interpretation. 


(3) De tout les enfants de mon école, je suis celui qui joue le plus
    of all the kids of my school, I am the.one who plays DEF CMP
    d'instruments. (French)
    of.instruments

    ‘Of all the kids in my school, I'm the one who plays the most
    instruments.’


(4) *Le plus de cygnes sont blancs. (French) 
the more of swans are white 


Intended: ‘Most swans are white.’


Example (3) shows that the quantity superlative le plus can be used with a
relative interpretation (comparing the speaker to other kids in the school); (4)
shows that it does not have a proportional interpretation; this example does not
mean ‘most swans are white’. Such languages are surprising from the perspective
of Hackl (2000; 2009), according to which the proportional readings of quantity
superlatives are parallel to absolute readings of quality superlatives. Romanian
and Greek are more well-behaved from that perspective; there, the superlative
of ‘many’ (literally ‘the more many’) can have a proportional interpretation. For
example, the Greek sentence in (5) is ambiguous as indicated:


(5) Efaga   ta  perissotera biskóta. (Greek)
    ate.1SG the much.CMP    cookies

    ‘I ate the most cookies’ or ‘I ate most of the cookies.’


This is one point of variation. 
Another point of variation is which types of superlatives are accompanied by 
definiteness-marking. We can distinguish between the following types: 


* Quality superlatives
  - Adjectival quality superlatives
    * Predicative, as in She is (the) tallest.
    * Adnominal; absolute reading, as in The tallest girl left.
    * Adnominal; relative reading, e.g. I’m not the one with the thinnest waist.
  - Adverbial quality superlatives, as in She runs the fastest.
* Quantity superlatives
  - Adnominal quantity superlatives
    * Relative reading, as in I ate the most cookies.
    * Proportional reading, as in I ate most of the cookies.
  - Adverbial quantity superlatives, as in She talks the most.


In French and Romanian, definiteness-marking appears on superlatives of all 
of these types. The same is not the case for Italian, Spanish and Portuguese. 
Despite forming quality superlatives through the combination of a definiteness- 
marker with a comparative form, these languages do not use definiteness-marking 
for adverbial superlatives or quantity superlatives on relative readings (and they 
generally do not allow proportional readings for quantity superlatives at all). Sen- 
tence (6) is an example from Italian (cf. de Boer 1986, Dobrovie-Sorin & Giurgea 
2015, i.a.): 



Elizabeth Coppock & Linnea Strand 


(6) Probabilmente è Hans che ha bevuto più caffè. (Italian)
    probably it.is Hans who has drunk CMP coffee


‘It is probably Hans who has drunk the most coffee.’


(A comparative interpretation, ‘It is probably Hans who has drunk more coffee’,
is also available here, although the cleft construction strongly biases toward
a superlative interpretation.) The same happens in Spanish and Portuguese.

In Greek, as illustrated below, there is a split between quantity and quality
adverbials (‘talk the most’ vs. ‘talk the fastest’): quantity adverbials are
obligatorily definite-marked and quality adverbials obligatorily lack
definiteness-marking. All other superlatives have a definiteness marker, relative
and proportional readings of quantity superlatives included.

So, in all of these languages, superlatives are generally formed by combining a
definiteness-marker with a comparative, yet in some of these languages,
superlatives may lack a definiteness-marker. This is certainly surprising if the
superlative interpretation is supposed to rest fully in the hands of the definite
determiner.
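
The distribution just described can be laid out as a small lookup table. The following Python sketch is only an illustrative encoding of the prose above, not part of the analysis: the names TYPES, MARKING and unmarked_types are hypothetical, the adjectival quality types are collapsed into one label, and None stands for a reading that the text reports as generally unavailable.

```python
# Illustrative encoding of the pattern described in the text.
#   True  = the superlative carries definiteness-marking
#   False = the superlative lacks definiteness-marking
#   None  = the reading is reported as (generally) unavailable

TYPES = [
    "adjectival quality",
    "adverbial quality",
    "quantity, relative",
    "quantity, proportional",
    "adverbial quantity",
]

# Italian, Spanish and Portuguese are described as patterning alike.
ITALIAN_LIKE = [True, False, False, None, False]

MARKING = {
    "French": dict(zip(TYPES, [True, True, True, None, True])),
    "Romanian": dict(zip(TYPES, [True, True, True, True, True])),
    "Greek": dict(zip(TYPES, [True, False, True, True, True])),
    "Italian": dict(zip(TYPES, ITALIAN_LIKE)),
    "Spanish": dict(zip(TYPES, ITALIAN_LIKE)),
    "Portuguese": dict(zip(TYPES, ITALIAN_LIKE)),
}


def unmarked_types(language):
    """Return the superlative types that lack definiteness-marking."""
    return [t for t in TYPES if MARKING[language][t] is False]


for language in MARKING:
    print(language, unmarked_types(language))
```

Laid out this way, the point of the paragraph is easy to see: every language in the table builds superlatives from DEF+CMP, yet only French and Romanian have no unmarked type.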

Generally, there are several analytical options we could consider for DEF+CMP 
superlatives. The one we have just ruled out (at least for some of these languages) 
is that the definite article itself is the marker of the superlative. Another is that 
the comparative is lexically ambiguous between a comparative and a superlative. 
Another would build on the stance argued for by Bobaljik (2012), where superla- 
tives are composed of comparatives and a bit that means ‘of all’. This latter piece 
could be taken to be silent in DEF+CMP languages; see Szabolcsi (2012) for a
formal analysis of the more in English along these lines. A fourth possibility is that a
superlative interpretation arises more or less directly from the composition of a 
comparative meaning and the meaning of the definite article, just as the surface 
form suggests. 

We show that a moderate instantiation of the last-mentioned strategy is vi- 
able, both for DEF+CMP languages and for certain cases in English like the more 
qualified candidate (of the two). In a nutshell, the standard argument of the com- 
parative is saturated by a degree-type pronoun. So the more qualified candidate, 
for example, denotes the candidate in the contextually-given comparison class C 
that is more qualified than contextually-given d, for appropriately chosen value 
of d. This is hypothesized to be possible in all of the languages under considera- 
tion (and even English, manifest in expressions like the taller one of the two). 
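
The idea can be made concrete with a toy model. The sketch below is a simplification under stated assumptions, not the compositional treatment itself: the function names the and def_cmp, the candidate names and the numeric scores are all hypothetical, and the definite article is modeled simply as a uniqueness check over a finite comparison class C.

```python
from typing import Callable, Iterable, Optional, TypeVar

E = TypeVar("E")


def the(candidates: Iterable[E]) -> Optional[E]:
    """Toy definite article: return the sole candidate, or None if
    uniqueness fails (zero or several candidates)."""
    found = list(candidates)
    return found[0] if len(found) == 1 else None


def def_cmp(measure: Callable[[E], float],
            comparison_class: Iterable[E],
            d: float) -> Optional[E]:
    """Toy 'the more A N': the unique member of the comparison class
    whose degree exceeds the contextually given degree d."""
    return the(x for x in comparison_class if measure(x) > d)


# Hypothetical qualification scores for three candidates.
scores = {"Ann": 7.0, "Bea": 9.0, "Cy": 5.0}
C = list(scores)


def qualification(name: str) -> float:
    return scores[name]


print(def_cmp(qualification, C, 8.0))  # Bea: the only one above 8.0
print(def_cmp(qualification, C, 6.0))  # None: Ann and Bea both exceed 6.0
```

In this toy setting, uniqueness is satisfied only when d is high enough that a single member of C exceeds it, and that member is then necessarily the highest-ranked one. This is one way to picture how a superlative-like interpretation could arise from a comparative plus a definite article for an appropriately chosen d.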

This is the common core. But there are conflicting pressures that lead to varia- 
tion with respect to whether definiteness-marking occurs. On the one hand, there 
is pressure to mark uniqueness on phrases where uniqueness can be marked, 
and on the other hand, there is pressure to avoid definiteness-marking on
descriptions of entities other than individuals. Different languages prioritize
differently when it comes to resolving these conflicts. We suggest furthermore that
proportional readings arise through grammaticalization, but via different routes 
for different languages. 

The following sections will present data from Greek, Romanian, French, and 
Ibero-Romance, in that order. These sections will lay out the basic facts
concerning the morphosyntax of superlatives in these languages. After a summary in §5,
compositional treatments of the various varieties will be sketched in §6.


2 Greek 


We begin with Greek, where a definite article may combine with either a
synthetic or a periphrastic comparative to form the superlative. For example,
the comparative form of psilós ‘tall’ has two varieties, psilóteros and pio
psilós, and both can combine with a definite determiner to form a superlative.
The two variants appear to be freely interchangeable, although the synthetic
one may be slightly more commonplace. For all of the types of examples we
elicited, many of which are presented below, both variants were judged to be
acceptable.


Table 2: Declension of the definite article in Greek 


            SINGULAR                  PLURAL
       MASC.   NEUT.   FEM.      MASC.   NEUT.   FEM.
NOM    o       to      i         oi      ta      oi
GEN    tou     tou     tis       ton     ton     ton
ACC    to(n)   to      ti(n)     tous    ta      tis


2.1 Quality superlatives


In adnominal superlatives, there is always a definite article, which agrees in
gender and number with the modified noun.2 The definite article is present
regardless of whether an absolute or relative interpretation is intended. Hence,
example (7) is ambiguous:3

(7) O Stellios odigei to pio grigoro aftokinito.
    the Stellios drives DEF CMP fast car


‘Stellios drives the fastest car.’


Example (8) strongly favors a relative interpretation; definiteness-marking is 
obligatory here as well. 


(8) Den eimai ego afti me ti leptoteri mesi stin oikogeneia.
    not I self she with DEF thin.CMP middle in family


‘I’m not the one with the thinnest waist in the family.’


Note that the periphrastic variety ti pio lepti mesi ‘the thinnest waist’, lit. ‘the 
more thin waist’, is equally acceptable here according to our consultants. 

Absolute and relative readings of adnominal superlatives are similar to each
other and to ordinary adjectives with respect to syntactic behavior as well. Greek
has a much-discussed construction, called “determiner spreading”, in which the
order of the adjective and the noun can be reversed; see Alexiadou (2014: 19) for
an extensive list of references. The interpretive effect of determiner spreading
is similar to that of placing an adjective postnominally in Romance: generally,
it is limited to restrictive modifiers (Alexiadou & Wilder 1998). But unlike in
Romance, this construction involves an extra definite determiner, as can be seen
in (9):


(9) a. to kokino to podilato
       DEF red DEF bicycle
       ‘the red bicycle’
    b. to podilato to kokino
       DEF bicycle DEF red
       ‘the red bicycle’


2 For reference, the inflectional paradigm for the definite article is as in Table 2. We suppress
the agreement features in our glosses for the sake of readability.
3 Thanks to Haris Themistocleous and Stergios Chatzikyriakidis for judgments and discussion.

Determiner spreading can involve superlatives; Alexiadou (2014) discusses the
example in (10), which has an absolute reading, referring to a particular cat:


(10) Spania haidevo tin mikroteri ti gata.
     seldom pet DEF smallest the cat


‘I seldom pet the smallest cat.’


Intuitions appear to be somewhat murky when it comes to determiner spread- 
ing with relative readings, but example (11), a variant of (8), was judged as accept- 
able by our consultants: 


(11) Den eimai ego afti me ti leptoteri ti mesi stin oikogeneia.
     not be.1SG I she with the thin.CMP DEF waist in family


‘I’m not the one with the thinnest waist in the family.’


This evidence suggests that the comparative adjective in an adnominal
superlative may be structurally analogous to an ordinary adjective in a
determiner-adjective-noun sequence, and that the article is in its ordinary position.

Adverbial quality superlatives are different, however; they do not involve a 
definite article, as can be seen in (12) and (13): 


(12) I  aderfi mou trechei pio grigora. 
DEF sister my runs CMP fast 


‘My sister runs the fastest.’


(13) Pios tragoudái pio kala? 
who sings more good 


‘Who sings the best?’ (Dobrovie-Sorin & Giurgea 2015: 16, ex. 71) 


Inserting a definite article before pio is not possible in such sentences; for
example, the variant of (12) *I aderfi mou trechei to pio grigora is
unacceptable. As Dobrovie-Sorin & Giurgea (2015) point out, this shows that the
definite article is not an integral part of superlative-marking in Greek.


2.2 Quantity superlatives 


Like quality superlatives, quantity superlatives are formed through the
combination of a definite article with a comparative form, which may be either
periphrastic, as in (14), or synthetic, as in (15). These two examples have
relative readings.

(14) Apó óla ta paidiá sto scholeío, egó paízo ta pio pollá órgana.
     of all DEF kids at school I play DEF CMP many instruments

‘Of all the kids in my school, I'm the one who plays the most
instruments.’


(15) Eimai aftos pou pinei ton ligotero kafe.
     I he who drinks DEF little.CMP coffee


‘I am the one who drinks the least coffee.’


Definiteness-marking is not optional here. Note that the word for ‘many’ is 
transparently contained within the superlative phrase in (14). 

Definite-marked quantity superlatives are also regularly used for expressing a 
proportional interpretation. Sentences (16-18) are some examples from our data: 


(16) S-ta perissotera paidiá sto scholeío mou arései na paízoun mousiki.
     DAT-DEF many.CMP kids at school mine like to play music


‘Most of the kids in my school like to play music.’


(17) I mama éftiaxe biskóta chthes kai éfaga ta perissótera.
     the mom made cookies yesterday and ate DEF many.CMP


‘Mom baked cookies yesterday and I ate most of them.’


(18) Ipia episis to perissótero gála.
     drank also DEF much.CMP milk


‘I drank most of the milk, too.’


Definiteness-marking is not optional here either.
Interestingly, there is a contrast between quality and quantity in the adverbial
domain. Adverbial quantity superlatives appear to require a definite article, as in
(19):4


(19) O Pavlos milaei to ligotero.
     DEF Paul talks DEF little.CMP


‘Paul talks the least.’


4 Thanks to a reviewer for pointing this out, and to Stavroula Alexandropoulou for discussion.

Removing the definite article in (19) yields a comparative interpretation, ‘Paul
talks less’. Notice that talk is intransitive, so it is unlikely that to ligotero is serving
as the object of the verb. Further evidence that the construction in question is
really adverbial comes from the fact that definite-marked quantity superlatives
can be coordinated with non-definite-marked adverbial quality superlatives, as
is the case in (20):


(20) O Pavlos milaei [pio grigora apo olus ke to perisotero].
     DEF Paul talks [CMP fast of all.ACC and DEF much.CMP]


‘Paul talks the fastest of all and the most.’


Thus adverbial quantity superlatives pattern with adnominal quantity superla- 
tives and quality superlatives, and differently from adverbial quality superlatives. 

Although quantity superlatives look morphologically very much like quality 
superlatives, there is a slight difference in their syntactic behavior. Definiteness 
spreading appears to be somewhat less acceptable with quantity superlatives 
than with quality superlatives. None of our consultants were entirely comfort- 
able with examples (21-22) (although they were characterized as “syntactically 
perfect”), and some rejected them: 


(21) a. ?? Efaga ta perissotera ta biskóta.
           ate.1SG DEF much.CMP the cookies


Intended: ‘I ate the most cookies’ or ‘I ate most of the cookies.’


     b. ?? Efaga ta biskóta ta perissotera.
           ate.1SG DEF cookies DEF much.CMP


Intended: ‘I ate the most cookies’ or ‘I ate most of the cookies.’


(22) a. ?? Eimai aftos pou pinei ton ligotero ton kafe.
           be.1SG him who drinks DEF little.CMP DEF coffee

‘I’m the one who drinks the least coffee.’

     b. ?? Eimai aftos pou pinei ton kafe ton ligotero.
           be.1SG him who drinks DEF coffee DEF little.CMP


‘I’m the one who drinks the least coffee.’


So definiteness-spreading appears to be somewhat more restricted in the quan- 
tity domain. 
However, Giannakidou (2004) gives examples such as the following: 

(23) I perissoteri i fitites efygan noris.
     DEF most DEF students left early


‘Most of the students left early.’


It is unclear to us whether this should be seen as an instance of determiner 
spreading or a construction in which i perissoteri behaves as a quantifier for 
which i fitites serves as the restrictor. According to one native Greek speaker 
we have consulted, the variant in (23) is much better than a version in which the 
noun precedes the quantifier: 


(24) ?I  fitites i perissoteri efygan noris. 
DEF students DEF most left early 


Example (24) is fully acceptable only with comma intonation separating the 
students from the most, and serves as an answer to the question What happened 
with the students?, rather than Who left early? We see an even stronger contrast 
with ligotero ‘less’, which doesn't give rise to proportional readings. 


(25) Ton ligotero ton kafe ton ipia egho.
     DEF less DEF coffee it drink.1SG I


‘I drink the least coffee.’


(26) * Ton kafe ton ligotero ton ipia egho.
       DEF coffee DEF less it drink.1SG I


Note that (25) is ungrammatical without the subject pronoun egho, even though 
Greek is normally a pro-drop language; this is presumably because of the require- 
ment of focus for relative readings. 

This evidence suggests that the structure in (23) is not a definiteness-spreading
structure but rather one in which i fitites behaves like a partitive argument
of i perissoteri. More generally, we take these facts to show that
definiteness-spreading is not possible with quantity superlatives in Greek.

To summarize the situation for Greek: definiteness-marking appears with every
type of superlative except adverbial quality superlatives. This list includes
adnominal quality superlatives on both relative and absolute readings, and
both adnominal and adverbial quantity superlatives. Relative and proportional
readings are available for adnominal quantity superlatives modifying both mass
nouns and count nouns. There is also full agreement with the noun in all cases
where there is a noun to agree with. So quantity superlatives are morphologically
very similar to quality superlatives overall. However, quantity superlatives differ
from quality superlatives with respect to definiteness-spreading, suggesting that
the two types are not syntactically parallel.


3 Romanian 


We turn now to Romanian, which is like Greek in some respects, but not in others.
It uses DEF+CMP for both relative and proportional readings, but there is evidence
that the definite article is more tightly knit with the comparative here than it is
in Greek.


3.1 Quality superlatives 


Example (27) shows a predicative use of a superlative in Romanian, (28) an at- 
tributive use, and (29) an adverbial use. 


(27) Pentru că eram cea mai entuziasmată.
     for that I.was DEF CMP enthusiastic


‘Because I (fem.) was the most enthusiastic.’


(28) A scris cea mai frumoasă compunere. 
has written DEF CMP beautiful composition.Acc 


‘She wrote the most beautiful composition.’


(29) Sora mea poate alerga cel mai repede. 
sister my can run DEF CMP fast 


‘My sister can run the fastest.’


In (27) and (28), cea is a feminine singular form of cel. In (29), we have the
invariant, default form.5 We will not gloss the agreement features, but simply
refer the reader to the inflectional paradigm for the demonstrative in Table 3,
taken from Cojocaru (2003: 53). Note also that the adjective frumoasă ‘beautiful’
shows feminine singular agreement with the noun compunere ‘composition’.

We gloss cel here as DEF, in order to bring out the parallels with other DEF+CMP 
languages, but it should be kept in mind that this element is not the most direct 
correlate of English the in the language. Cel is not found in ordinary, simple
definites; instead a suffix is used. For example, in (30a), we have a feminine singular
definite ending -a, modified from the stem-inherent -ă illustrated in (30b). We
gloss this ending here as DEF as well.


5 Pană Dindelegan (2013: 315) points out that adverbial cel can receive dative case marking, so it
is not entirely invariable.

(30) a. Carte-a e pe mas-a mare.
        map-DEF is on table-DEF big


‘The map is on the big table.’


     b. Carte-a e pe o masă mare.
        map-DEF is on a table big


‘The map is on a big table.’


Note also that in traditional grammar (e.g. Cojocaru 2003), cel is classified as 
a demonstrative, though it has additional functions as well. For instance, it can 
double a definite suffix (Alexiadou 2014): 


(31) Legile (cele) importante n'au fost votate.
     laws-DEF (DEF) important have not been voted


‘The laws which were important have not been passed.’


See Alexiadou (2014: 53-62) for a recent discussion of this phenomenon and 
its relation to Greek determiner spreading. 

As (31) implies, Romanian has two word order options for adjectives, including 
superlatives. This choice bears on the presence or absence of a definite suffix on 
the noun. If the adjective precedes the modified noun as in (28), repeated in (32a), 
this noun remains uninflected. If the noun precedes the adjective, as in (31) and 
(32b), the noun receives definiteness marking (Cojocaru 2003: 53). 


Table 3: Inflectional paradigm for cel in Romanian 


            SINGULAR                   PLURAL
            MASC., NEUT.   FEM.        MASC.   FEM., NEUT.
N., A.      cel            cea         cei     cele
G., D.      celui          celei       celor   celor


12 Most vs. the most in languages where the more means most 


(32) a. A scris cea mai frumoasă compunere.
        has written DEF CMP beautiful composition.ACC


‘She wrote the most beautiful composition.’


     b. A scris compunere-a cea mai frumoasă.
        has written composition-DEF DEF CMP beautiful


‘She wrote the most beautiful composition.’


According to Teodorescu (2007), the prenominal variant (32a) and the post- 
nominal variant (32b) have the same interpretive options. The following is an 
example favoring a relative interpretation; both orders, shown in (33a) and (33b), 
are reportedly fine, although all four of the Romanian speakers we consulted 
spontaneously translated the sentence indicated in the English gloss using the 
prenominal variant (33a). 


(33) a. Eu nu sunt cea din familie cu cel mai subtire talie.
        I not be.1SG DEF from family.ACC with DEF CMP thin waist


‘I am not the one in my family with the thinnest waist.’


     b. Eu nu sunt cea din familie cu tali-a cea mai subtire.
        I not be.1SG DEF from family.ACC with waist-DEF DEF CMP thin


‘I am not the one in my family with the thinnest waist.’


Note that postnominal adjectives typically receive an intersective interpreta- 
tion (Cornilescu 1992; Marchis & Alexiadou 2009; Teodorescu 2007): 


(34) a. o poveste adevărată
        a story true


‘a story that is true’ (not ‘quite a story’)


     b. o adevărată poveste
        a true story


‘a story that is true’ or ‘quite a story’


     c. Această poveste este adevărată.
        this story is true


‘This story is true.’

The postnominal adjective in (34a) has only the interpretation that the
adjective in (34c) has, while the prenominal adjective in (34b) can also have a
non-intersective interpretation. If this applies to superlatives, then the fact
that both relative and absolute readings of superlatives are possible in
post-nominal position suggests that both relative and absolute readings are, or
can be, restrictive readings.


6 Thanks to Gianina Iordachioaia for help and discussion.

Dobrovie-Sorin & Giurgea (2015) give a number of arguments that cel mai + 
AP form a constituent that sits in the specifier of DP. One is the striking fact that 
cel can be preceded by an indefinite article as in (35) (Dobrovie-Sorin & Giurgea 
2015: 15, ex. 64): 


(35) Există întotdeauna un cel mai mic divizor comun a două elemente. 
exists always a DEF CMP small divisor common of two elements 


‘There always exists a smallest common factor of two elements.’


Their second argument is that cel is always present in superlatives, both when 
the superlative is post-nominal as in (32b), and when it is adverbial as in (36). 


(36) Va fi premiat cel care va scrie *(cel) mai clar.
     will be awarded-prize DEF which will write DEF more clearly


‘The one who writes the most clearly will be awarded a prize.’
(Dobrovie-Sorin & Giurgea 2015: 15, ex. 66)


Their third argument is that definite comparatives involve the suffix (which 
appears on the adjective preceding the head noun) rather than cel, as in (37): 


(37) ... dar cu mult mai difficil-ul obiectiv al ...
     ... but with much more difficult-the goal of ...


‘... but with the much more difficult goal of ...’


So cel must have some meaning or function distinct from the suffix. They also 
observe that the unmarked position of comparatives is postnominal, whereas 
the unmarked position for superlatives is prenominal, and note that cel cannot 
be separated from a prenominal comparative by numerals (though numerals can 
normally follow cel), which can be seen in the contrast between (38a) and (38b): 


(38) a. *cei doi mai înalți munți 
DEF two more high mountains 


b. cei mai înalți doi munți
DEF more high two mountains 


'the two highest mountains' 

These arguments have us convinced that cel in superlatives is not a direct
dependent of the modified noun, but rather forms a phrase with the comparative
marker and the adjective to the exclusion of the noun. So the structure of cea mai
frumoasă compunere ‘the most beautiful composition’ appears to be:


(39) [ [ cea mai frumoasă ] compunere ]


3.2 Quantity superlatives 


Now let us turn to quantity superlatives in Romanian. As with quality superla- 
tives, definiteness-marking is ubiquitous, even with adverbials, as in (40): 


(40) Personajele de care se râdea cel mai mult erau Leana și nea
     characters of which they laughed DEF CMP much were Leana and uncle
     Nicu.
     Nicu


‘The characters they laughed at the most were Leana and uncle Nicu.’


And the DEF+CMP construction can have both proportional and relative read- 
ings in Romanian. Examples (41) and (42) have relative readings (the latter from 
Teodorescu 2007: 11). 


(41) Eu sunt cel care canta la cele mai multe instrumente.
     I am the which plays to DEF CMP much instruments


‘I am the one who plays the most instruments.’


(42) Dan a băut cea mai multă bere.
     Dan has drunk DEF CMP much beer


‘Dan drank the most beer.’


Example (43) is a case with a proportional reading, using the partitive
preposition dintre:7


(43) Cele mai multe dintre copiii care merge la scoala mea place sa se
     DEF CMP much of kids.DEF who go at school mine like to REFL
     joace muzica.
     play music


‘Most of the kids who go to my school like to play music.’


7 The preposition dintre (din with singular complements) is used in Romanian to introduce an
explicit comparison class in superlative constructions, e.g. El scrie cel mai bine dintre toți, ‘He
writes the best of all’, lit. ‘He writes the more good among all’ (Cojocaru 2003: 169). Dintre is
also used in quantificational partitive constructions, e.g. Unul dintre ei prezintă proiectul ‘One
of them is presenting the project’.

We also find non-partitive uses as in (44) and (45):


(44) Cei mai mulți elevi din clasa mea au plecat devreme.
     DEF CMP many students from class.the my have left early


‘Most of the students in my class have left early.’


(45) Cele mai multe lebede sunt albe.
     DEF CMP many swans are white


‘Most swans are white.’


But the syntactic position of the superlative phrase may not be the same as
with quality superlatives: in contrast to quality superlatives, quantity
superlatives are normally only permitted prenominally (Teodorescu 2007: 11), as
example (46) shows.


(46) *Dan a băut bere-a cea mai multă.
      Dan has drunk beer-DEF DEF CMP much


Intended: ‘Dan drank the most beer.’


Dobrovie-Sorin (2015) does give an example of a postnominal cel mai mult
construction, (47b), alongside its prenominal counterpart (47a), but says that it
gives rise not to a relative or proportional reading but to a “comparison between
predefined groups”, where the noun phrase refers to one of these groups.


(47) a. Cele mai multe lebede sunt albe.
        DEF CMP many swans are white


‘Most swans are white.’


     b. ? Lebedele cele mai multe sunt albe.
          swans.DEF DEF CMP many are white


‘The more/most numerous (group of) swans are white.’


This reading is referential rather than quantificational, and is distinct from
the proportional reading that arises in prenominal position.

Interestingly, (42) above does not have a proportional interpretation. Accord- 
ing to Dobrovie-Sorin (2015), this is tied to the fact that a mass noun is involved. 
Indeed, in our data, a proportional interpretation, in the case of mass quantifica- 
tion (shown in 48 and 49), typically involves a ‘majority’ or ‘part’ noun instead, 
just as in other Romance languages: 


(48) Am baut majoritatea laptelui.
     have drunk majority milk


‘I drank most of the milk.’


(49) Am baut mai mare partea laptelui.
     have drunk CMP big part GEN milk


‘I drank most of the milk.’


Dobrovie-Sorin argues that cel mai mult functions as a complex proportional
quantifier, one that expects a count-noun denotation as an argument. Providing
further evidence for this view, she claims that a proportional reading is not
always available for count nouns, either, pointing to a contrast in acceptability
between (50) and (51):


(50) Cei mai multi elevi din clasa mea au plecat devreme.
     DEF CMP many students.DEF of class.DEF my have left early.


‘Most students in my class left early.’ (Dobrovie-Sorin 2015: 395)


(51) * Cei mai multi băieți s-au adunat în sala asta.
       DEF CMP many boys REFL-have gathered in room.DEF this.


‘Most of the boys have gathered in this room.’ (Dobrovie-Sorin 2015: 395)


She ascribes these differences to whether or not the nuclear scope is filled 
with a distributive predicate. The unacceptability of (51) is explained under the 
assumption that the subject noun phrase is quantificational rather than
referential. This adds to the evidence in favor of Dobrovie-Sorin's (2015) idea
that cel mai mult has grammaticalized as a proportional determiner.

To summarize: superlatives are always definite in Romanian. Evidence involv- 
ing quality superlatives suggests that the definite element is integrated more 
closely with the comparative element than with the modified noun, i.e. lower 
down in the structure, not signalling definiteness at the level of the full nominal. 
Both relative and proportional readings are available for adnominal quantity su- 
perlatives, although the proportional readings are limited to count nouns. The 
existence of proportional readings only with count nouns as well as the unac- 
ceptability of collective predicates suggests that cel mai mult has grammatical- 
ized into a proportional determiner (Dobrovie-Sorin 2015). 


4 Ibero-Romance 


4.1 Quality superlatives


Predicative adjectival superlatives in Italian, as in (52), and Spanish, as in (53), 
normally involve a definite article: 


(52) Carla è la più intelligente di tutte queste studentesse. (Italian)
     Carla is DEF CMP intelligent of all these students


‘Carla is the most intelligent of all these students.’ (de Boer 1986: 53)


(53) Ese carro es el mejor. (Spanish)
     that car is DEF better


‘That car is the best.’ (Rohena-Madrazo 2007: 1)


One exception, as illustrated in (54), is noted by de Boer (1986: 53), who gives 
the following predicative example without definiteness-marking. 


(54) il giorno in cui il nostro lavoro era più faticoso (Italian)
     DEF day in which DEF our work was CMP tiresome


‘the day on which our work was most tiresome’


Here, even though the example is grammatically predicative, it has the flavor 
of a relative reading, comparing days rather than alternatives to the subject of 
the sentence il nostro lavoro ‘our work’. The same example in French, shown in
(55), involves a definite article (Alexandre Cremers, p.c.): 

(55) le jour où notre travail était le plus fatiguant (French)
     DEF day when our work was DEF CMP tiresome


‘the day on which our work was most tiresome’


Matushansky (2008a: 75) reports a similar phenomenon in Spanish presented 
in examples (56) and (57): 


(56) la que es más alta (Spanish)
     DEF who is CMP tall


‘the one who is tallest’


(57) la que está más enojada (Spanish)
     DEF who is CMP annoyed


‘the one who is most annoyed’


In both these examples and in the Italian example (54), uniqueness is indicated 
with the help of a relative clause. These patterns suggest that superlatives require
marking of uniqueness in some fashion, not necessarily with an accompanying 
definite article. 

As in French, adnominal superlatives can appear both pre- and post-nominally
in Italian, as the reader can see in (58a) and (58b): 


(58) a. La mamma fa i biscotti più buoni del mondo. (Italian)
        DEF mom makes DEF cookies CMP tasty of.DEF world


‘Mom bakes the yummiest cookies in the whole world.’


     b. La mamma fa i più buoni biscotti del mondo.
        DEF mom makes DEF CMP tasty cookies of.DEF world


Normally, there is no definite article on a postnominal superlative in Italian, 
although Plank (2003) reports that both variants in (59a) and (59b) are acceptable, 
the latter "putting greater emphasis on the adjective": 


(59) a. l'uomo più forte (Italian)
        DEF man more strong
        ‘the stronger / strongest man’
     b. l'uomo il più forte
        DEF man the more strong


‘the strongest man’

Example (60) displays a postnominal superlative in Italian with a relative
reading; here again there is no definite article:8


(60) a. Non sono quello con il girovita più sottile in famiglia. (Italian)
        not am the.one with DEF waist CMP thin in family


‘I’m not the one with the thinnest waist in the family.’


     b. #Non sono quello con il più sottile girovita in famiglia.
        not am the.one with DEF CMP thin waist in family


Adverbial quality superlatives systematically lack definiteness-marking in Ital- 
ian, as shown in example (61) from de Boer (1986: 53): 


(61) Di tutte queste ragazze, Marisa lavora più diligentemente. (Italian)
     of all these kids Marisa works CMP diligently


‘Of all these kids, Marisa works the most diligently.’

The same holds in Spanish:


(62) Juan es el que corre más rápido. (Spanish)
     Juan is DEF who runs CMP fast


‘Juan is the one who runs the fastest.’ (Rohena-Madrazo 2007: 1-2)


As Rohena-Madrazo (2007) notes, the relative clause in (62) is necessary in or- 
der for a superlative interpretation to arise. Example (63) has only a comparative 
interpretation: 


(63) Juan corre más rápido. (Spanish)
     Juan runs CMP fast


‘Juan runs faster.’


Thus a superlative interpretation does not freely arise on its own here; unique- 
ness must somehow be signaled in the absence of a determiner. 


8 According to Cinque (2010: 11-12), only the postnominal syntax is possible on relative
readings. Here is a speculation as to how one might explain this in semantic/pragmatic terms: the
prenominal position is normally hostile to non-restrictive modifiers in Italian (e.g. *la presenza
mera vs. la mera presenza ‘the mere presence’). Matushansky (2008b) proposes that the
modified noun saturates the comparison class argument of a superlative, so that a superlative
modifier combines with the noun via Functional Application rather than Predicate Modification.
This kind of analysis would yield an absolute reading; suppose this is how absolute readings
arise. Then absolute readings would be non-restrictive and relative readings would be
restrictive. Placing a superlative postnominally could then serve as an indication that an absolute
reading is not intended.
4.2 Quantity superlatives


Naturally, we expect the definite article to mark the superlative degree with quan- 
tity superlatives as it does with quality superlatives. However, the definite article 
is sometimes absent even in superlative constructions. De Boer (1986: 53) gives 
the example in (64); our informants consistently gave us translations like that in 
(65) and (66) for sentences involving relative readings: 


(64) Dei nostri amici Luigi è quello che ha più soldi. (Italian)
     of.DEF our friends Luigi is the.one who has CMP money
     ‘Of our friends, Luigi is the one who has the most money.’


(65) Ma probabilmente è Hans che ha bevuto più caffè. (Italian)
     but probably it.is Hans who has drunk CMP coffee
     ‘But it is probably Hans who has drunk the most coffee.’


(66) Di tutti i ragazzi della mia scuola io sono quello che suona più strumenti. (Italian)
     of all DEF kids in.DEF my school I am the.one that plays CMP instruments
     ‘Of all the kids in my school, I’m the one who plays the most instruments.’


Hence there is no overt morphological distinction between ‘more coffee’ and
‘most coffee’.

Following Bosque & Brucart (1991), Rohena-Madrazo (2007) uses comparative 
and superlative “codas” to distinguish between comparative and superlative in- 
terpretations in Spanish, as in (67) and (68) respectively: 


(67) el niño más rápido (que todos nosotros) (Spanish)
     DEF boy CMP fast (than all we)
     ‘the boy faster (than all of us)’


(68) el niño más rápido (de todos nosotros) (Spanish)
     DEF boy CMP fast (of all we)
     ‘the fastest boy (of all of us)’

In (68), the boy is among ‘us’, but not in (67). Using this technique, he shows
that so-called "free" superlatives in Spanish, as shown in (69), can be fronted
before the verb, but comparatives cannot:⁹


(69) Juan es el niño que más libros leyó (de/*que todos ellos). (Spanish)
     Juan is DEF boy that CMP books read (of/*than all them)
     ‘Juan is the boy that read the most books (of/*than all of them).’


This evidence suggests that the comparative and the superlative interpreta- 
tions are really distinct. 

Similarly, the most instruments in ‘I’m the one who plays the most instruments’ 
and the most coffee in 'Hans has drunk the most coffee' are translated without 
definiteness-marking in other Ibero-Romance languages, as we can see in the 
sets of examples in (70) and (71): 


(70) a. Yo soy el que toca más instrumentos. (Spanish)
     b. Eu sou o que toca mais instrumentos. (Portuguese)
     c. Jo sóc qui toca més instruments. (Catalan)
     ‘I am the one who plays the most instruments.’


(71) a. Hans es el que ha bebido más café. (Spanish)
     b. Hans é quem bebeu mais café. (Portuguese)
     c. Hans és probablement qui ha begut més cafè. (Catalan)
     ‘Hans is the one who has drunk the most coffee.’


Adverbial quantity superlatives also lack definiteness-marking, as (72) and (73) 
show: 


(72) ... uno che lavora più di tutti e parla meno di tutti. (Italian)
     ... one who works CMP of all and speaks little.CMP of all
     ‘... one who works most of all and speaks least of all’


⁹ “Free superlatives” include adverbial superlatives like más rápido ‘the fastest’ and quantity
superlatives like más libros ‘the most books’. In contrast, “incorporated superlatives” such
as el niño más rápido ‘the fastest boy’ are defined as being contained within an NP. The
free/incorporated distinction in Spanish happens to draw a line between adnominal quality
superlatives on the one hand and quantity and adverbial superlatives on the other.

(73) Alberto es el que trabaja más. (Spanish)
     Alberto is DEF that works CMP
     ‘Alberto is the one who works the most.’


Unlike in French and Romanian, a definite article would be ungrammatical
preceding the comparative word here. Rather, adverbial quantity superlatives
follow the pattern of adnominal quantity superlatives here (as in all of the languages
under consideration, in fact).

The DEF+CMP construction is generally not used to express proportional read- 
ings. Proportional most is generally translated using other types of constructions, 
such as ‘the greater part’ in (74): 


(74) Alla maggior parte dei bambini nella mia scuola piace suonare. (Italian)
     of.DEF big.CMP part of.DEF kids in my school like play
     ‘Most of the kids in my school like to play (music).’


The same holds for the entire Ibero-Romance subfamily, as far as we can see,
including Spanish, Portuguese, and Catalan. For example, ‘most of the kids’ in ‘Most
of the kids in my school like to play music’ is translated using a majority noun in
these languages, as can be seen in (75):


(75) a. La mayoría de los niños... (Spanish)
     b. A maioria das crianças... (Portuguese)
     c. La majoria dels nens... (Catalan)
     ‘Most of the kids...’


However, according to Dobrovie-Sorin & Giurgea (2015: 20), "Italian allows 
the article and a proportional meaning in the partitive construction": 


(76) Il più degli uomini predicano ciascuno la sua benignità. (Italian)
     the more of.DEF men preach each the his kindness
     ‘Most men preach their own kindness’


Dobrovie-Sorin & Giurgea (2015: 21) also write that this is possible with no 
overt partitive complement. 


(77) Gli ospiti sono partiti. I più erano già stanchi. (Italian)
     DEF guests have left DEF CMP were already tired
     ‘The guests left. Most (of them) were already tired.’

This shows that to the extent that proportional readings for quantity superlatives
are allowed in Italian, they are signalled with the definite article. In this
respect, Italian is like Swedish: definite for proportional and non-definite for rel- 
ative. But this construction appears more restricted than Swedish de flesta ‘most’, 
given that it can only occur with partitive complements. Our Spanish and French 
informants do not accept the DEF+CMP construction in the same environment, so 
this appears to be specific to Italian among the Ibero-Romance languages. 

To summarize: Italian and other Ibero-Romance languages use definiteness- 
marking for adnominal quality superlatives, and ordinary predicative quality su- 
perlatives, but not quantity superlatives, adverbial superlatives, or predicative 
quality superlatives embedded in phrases uniquely characterizing a given
discourse referent. Proportional readings are generally not available for
quantity superlatives, with the exception of il più in Italian accompanied by a 
partitive complement. 


5 Summary 


Table 4 gives a summary of the definiteness-marking patterns we have observed. 
For a set of languages in which superlatives are formed with the help of a defi- 
nite article, there is a remarkable diversity of definiteness-marking patterns on 
superlatives. 


Table 4: Definiteness-marking in superlatives in DEF+CMP languages


Greek Romanian French Italian Spanish 


Qual./pred. + + 
Qual./pred. (rel. clause) 
Qual./prenom. 
Qual./postnom. 
Qual./adv. 

Quant./prop. 
Quant./rel. 

Quant./adv. 


* * 


+ + + + 


l 
+ + + + + 


+ + + t+ + + t+ + 


+ + + 


The contrasts raise a number of questions, including: 


• Why do quantity superlatives in Ibero-Romance lack definiteness-marking,
  in contrast to Greek, Romanian, and French?

• Why are adverbial superlatives marked definite in French and Romanian,
  but not Italian, and why is there a split among adverbial superlatives in
  Greek?

• Why is definiteness-marking absent on predicative superlatives in relative
  clauses in Italian, but not in French?

• Why do Greek and Romanian allow proportional readings for DEF+CMP
  but not Spanish or French, and why is it limited to partitive environments
  in Italian?


We cannot address all of these issues adequately here. However, we will sug- 
gest a certain perspective that may bring some of this apparent chaos to order. 

The perspective is as follows. The variety of different definiteness-marking 
patterns we see suggests that the grammars of these languages may be pulled 
by a number of competing pressures. One pressure is to mark uniqueness of a 
description overtly. Another pressure, we suggest, is to avoid combining a defi- 
nite determiner with a predicate of entities other than individuals, such as events 
or degrees. In conjunction with certain additional assumptions regarding the se- 
mantics of various types of superlatives, these pressures result in a dispreference 
for certain patterns. These assumptions are made explicit in the following section. 


6 Formal analyses 


6.1 Quality superlatives
6.1.1 Prenominal quality superlatives 


To derive a superlative meaning for DEF+CMP constructions, let us start with the 
assumption that the basic meaning for a comparative like Greek pio is a func- 
tion from measure functions to degrees to individuals to truth values, roughly 
following Kennedy (2009), Alrenga et al. (2012), and Dunbar & Wellwood (2016), 
among others.¹⁰


¹⁰ This presentation glosses over the fact that not all comparatives are alike. An illustration of
this point of particular relevance to the case at hand is provided by the detailed studies of comparison in
Greek by Merchant (2009; 2012), where there are three morphosyntactic strategies for marking
the standard: (i) the preposition apo 'from' introducing a phrasal standard; (ii) a genitive
case marker, also introducing a phrasal standard; and (iii) a complex standard marker ap-oti 
'from-wh' which introduces both reduced and unreduced clausal standards. Merchant (2012) 
concludes that if all of the work is to be done by the comparative, then three different lexical 
entries for the comparative are needed. But there is hope for a unified analysis; the two phrasal 
(78) pio $\leadsto \lambda g \lambda d \lambda x .\, g(x) > d$


In (78), $g$ denotes a measure function, a function that maps individuals to degrees.
A gradable adjective like long is assumed to denote such a function.¹¹ Modulo
lambda-conversion, this yields the translation in (79) for pio grigoro ‘faster’:


(79) pio grigoro $\leadsto \lambda d \lambda x .\, \mathrm{FAST}(x) > d$


The next ingredient is a meaning shift that we refer to as Definite Null Instantiation,
in homage to Fillmore (1986), as defined in (80). It takes any function and
saturates its argument with an unbound variable.¹²


(80) Definite Null Instantiation (Meaning Shift)
If $\alpha \leadsto \alpha'$, and $\alpha'$ is an expression of type $\langle \sigma, \tau \rangle$, then $\alpha \leadsto \alpha'(v)$ as well,
where $v$ is an otherwise unused variable of type $\sigma$.


Applying this gives (81), where $\mathbf{d}$ is an unbound degree-type variable:

(81) pio grigoro (after DNI) $\leadsto \lambda x .\, \mathrm{FAST}(x) > \mathbf{d}$


We have written d in bold-face in order to draw attention to the fact that it is 
unbound. (We could of course have chosen a variable other than d; all we needed 
was a degree variable that is not otherwise used.) This description can combine 
with a noun like aftokinito ‘car’ using Predicate Modification to produce (82): 


comparatives differ only in the order in which they take their arguments, and Kennedy (2009) 
shows that one of the phrasal meanings can be derived from the clausal meaning. Moreover, 
Alrenga et al. (2012) offer a new perspective on the division of labor between the comparative 
and the standard marker, allowing for a unified view on the comparative morpheme across 
these constructions, with differences attributed to the standard markers. They use a lexical 
entry like (78) for the comparative, and clausal and phrasal standard markers each combine 
with it appropriately in their own way. In light of this work, we may continue to operate 
under the assumption that (78) constitutes a viable candidate for a unified treatment of the 
comparative morpheme across different types of constructions and across the languages un- 
der consideration.

¹¹ The arrow $\leadsto$ signifies a translation relation from a natural language expression (part of an LF
representation) to an expression of a typed extensional language; we thus adopt an "indirect
interpretation" framework, in which expressions of natural language are translated to a formal
representation language. Within this framework we assume the standard rule of Functional
Application:


(i) Functional Application (Composition Rule)
If $\alpha \leadsto \alpha'$ and $\beta \leadsto \beta'$, and $\alpha'$ is of type $\langle \sigma, \tau \rangle$ and $\beta'$ is of type $\sigma$, and $\gamma$ is a phrase
whose only constituents are $\alpha$ and $\beta$, then $\gamma \leadsto \alpha'(\beta')$.


¹² Note that this meaning shift depends on the assumption that the $\leadsto$ relation is not a function;
a given natural language expression can have multiple translations into the formal language
and they need not be equivalent. See Partee & Rooth (1983) for precedent for this assumption.

(82) [pio grigoro] aftokinito $\leadsto \lambda x .\, \mathrm{FAST}(x) > \mathbf{d} \wedge \mathrm{CAR}(x)$


If there is a unique fastest car, then there will be a way of choosing a value for 
d in such a way that this description picks it out. Hence, given an appropriate 
choice of value d, the definite article should be able to combine with this descrip- 
tion to pick out the most qualified candidate. Normally, the range of potential 
referents will be limited to a class C, which we may suppose is referenced by the 
definite determiner, as displayed in (83). 


(83) to $\leadsto \lambda P_{\langle \tau, t \rangle} .\, \iota x_{\tau} .\, P(x) \wedge C(x)$


Here $\tau$ is a variable over types, constrained in specific ways by different
languages. Applied to pio grigoro aftokinito, this denotes the unique car in C that
is faster than d. The structure of the derivation is the one in (84).


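(84) $e$
     ├── to : $\langle \langle \tau, t \rangle, \tau \rangle$
     └── $\langle e, t \rangle$ (by Predicate Modification)
         ├── $\langle e, t \rangle$
         │   ├── $\langle \langle \tau, d \rangle, \langle \tau, t \rangle \rangle$
         │   │   └── pio : $\langle d, \langle \langle \tau, d \rangle, \langle \tau, t \rangle \rangle \rangle$
         │   └── grigoro : $\langle e, d \rangle$
         └── aftokinito : $\langle e, t \rangle$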
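
The composition in (78)-(83) can be mimicked in a small executable sketch. Everything concrete here is an assumption introduced for illustration only: the toy entities and speeds, the helper names (`dni`, `predicate_modification`, `the`), and the decision to pass the value of the unbound degree variable explicitly instead of leaving it to an assignment function.

```python
from typing import Callable, Dict, Optional, Set

Entity = str
Degree = float

# Toy model: hypothetical entities and speeds, chosen only for illustration.
SPEED: Dict[Entity, Degree] = {"car_a": 120.0, "car_b": 150.0, "car_c": 180.0, "bike": 30.0}
CARS: Set[Entity] = {"car_a", "car_b", "car_c"}


def fast(x: Entity) -> Degree:
    """Measure function FAST: maps an individual to a degree."""
    return SPEED[x]


def car(x: Entity) -> bool:
    """The predicate CAR."""
    return x in CARS


def pio(g: Callable[[Entity], Degree]) -> Callable[[Degree], Callable[[Entity], bool]]:
    """(78): lambda g . lambda d . lambda x . g(x) > d."""
    return lambda d: lambda x: g(x) > d


def dni(f: Callable, v):
    """(80), simplified: saturate the next argument of f with a value for the free variable."""
    return f(v)


def predicate_modification(p: Callable[[Entity], bool], q: Callable[[Entity], bool]):
    """Intersect two predicates of the same type."""
    return lambda x: p(x) and q(x)


def the(pred: Callable[[Entity], bool], domain: Set[Entity]) -> Optional[Entity]:
    """(83): the unique member of the domain C satisfying pred, or None if there is none."""
    matches = [x for x in sorted(domain) if pred(x)]
    return matches[0] if len(matches) == 1 else None


def faster_car(d: Degree) -> Callable[[Entity], bool]:
    """(82): [pio grigoro] aftokinito, with d supplied for the unbound degree variable."""
    return predicate_modification(dni(pio(fast), d), car)


C: Set[Entity] = set(SPEED)
print(the(faster_car(160.0), C))  # car_c: the only car faster than 160
print(the(faster_car(100.0), C))  # None: three cars exceed 100, so uniqueness fails
```

The two calls illustrate the point made in the text: the description picks out the fastest car only for an appropriate choice of d; with too low a value, uniqueness fails.
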


This clearly gives an absolute superlative reading. What about relative read- 
ings such as (8), with ti leptoteri mesi ‘the thinnest waist’? The analytical land- 
scape is quite different under the assumption that there is no superlative mor- 
pheme. One influential analysis of the absolute vs. relative distinction, due to 
Szabolcsi (1986) and developed in Heim (1999), holds that relative readings arise 
through movement of -est at LF to a position adjacent to the constituent of the 
sentence corresponding to one of the elements being compared, typically the fo- 
cus. With no -est to undergo movement, this analytical route is not available to 
us. 
A prominent class of alternatives to the movement view is that -est remains
in situ, the absolute vs. relative contrast resulting from different settings of the 
comparison class (Gawron 1995; Farkas & Kiss 2000; Sharvit & Stateva 2002; 
Gutiérrez-Rexach 2006; Teodorescu 2009; Pancheva & Tomaszewicz 2012; Cop- 
pock & Beaver 2014; Coppock & Josefson 2015). This type of approach is more 
amenable to the assumptions that we have made here. Although we have no 
superlative morpheme to provide a comparison class, the definite article is re- 
stricted to a contextually-determined domain C, and the contrast could concern 
the value of that contextually-set variable. On a relative reading of the fastest
car, for example, C might consist of cars standing in a salient correspondence 
relation to the focus alternatives. 

Heim (1999) notes that so-called "upstairs de dicto" readings pose a challenge 
for the in situ approach. The problem is that John wants to climb the highest 
mountain can be true in a context where there is no specific mountain that John 
wants to climb, nor does John's desire pertain to the relative heights of moun- 
tains climbed by various competitors; it just so happens that he wants to climb a 
5000 mountain (any such mountain), and the ambitions of the others in the con- 
text with respect to the heights of mountains they want to climb are not so great. 
This reading can be obtained by scoping just -est over the intensional verb want. 
Such a reading is apparently available in at least Greek and French, according to 
our informants. 

Various responses to that challenge have been offered. Sharvit & Stateva (2002) 
offer an in situ theory designed to handle these readings, but it relies on a non- 
standard definite determiner, so that solution is not directly compatible with our 
analysis. Solomon (2011) points out that upstairs de dicto readings can be handled 
if the comparison class is thought to be a set of degrees rather than individuals. 
This is more amenable to the assumptions we have made, and would only require 
us to allow for the possibility that the definite article combine directly with a d- 
saturated version of cmp that compares degrees rather than individuals and serve 
to pick out a specific degree. 

Other routes may be compatible with the analysis as it stands. Coppock & 
Beaver (2014) argue that the "upstairs de dicto" phenomenon is part of a more 
general phenomenon that requires an explanation anyway, namely cases like 
Adrian wants to buy a jacket like Malte's, discussed by Fodor (1970) and in much 
subsequent literature under the heading of “Fodor’s puzzle”. If indeed upstairs 
de dicto readings can be seen as an instance of Fodor's puzzle, then the problem 
can be explained away. Another alternative is offered by Bumford (2016), who 
posits a sort of definiteness that is subordinated to the modal element. Although 
Bumford's theory of the definite article is different from the simple one we have
sketched here, his suggested approach for dealing with intensional contexts may 
be viable even in the context of a more standard analysis. In any case, we believe 
it is an open question whether upstairs de dicto readings can indeed be managed 
in the context of an in situ approach using the sort of approach to the definite 
article that we have taken here, and the success of our analysis in dealing with 
them depends on a general solution to this problem. 

Another fact to be accounted for is that, as Szabolcsi (1986) pointed
out, superlatives on relative readings behave like indefinites, suggesting that they 
are, in Coppock & Beaver's (2015) terms, indeterminate. We refer to Coppock & 
Beaver (2014) for ideas on how to capture the indeterminacy of relative readings 
in the context of an in situ analysis. 

Another question that this proposal raises is how to rule out overt standard 
phrases with comparatives that combine with definite articles. These are entirely 
ungrammatical: 


(85) *Elle est la plus belle que {Marie, j'ai imaginé}. (French)
     she is the CMP beautiful than {Marie, I've imagined}


The same is true for definite comparatives in English, as Lerner & Pinkal (1995) 
observe: 


(86) * George owns the faster car (than Bill).


Lerner & Pinkal (1995) also observe that this is part of a larger pattern, where 
weak determiners allow overt standard arguments and strong determiners disal- 
low them: 


(87) George owns a/some/a few faster car(s) than Bill.


(88) * George owns every/most faster car(s) than Bill. 


Beil (1997) offers an explanation of this contrast on the basis of the fact that 
strong determiners have a domain that has to be presupposed in previous con- 
text. Xiang (2005) offers an alternative explanation, on which strong quantifiers 
induce an LF intervention effect blocking the movement that the than phrase 
needs to undergo. This idea is quite compatible with the present analysis. In a 
case where Definite Null Instantiation has applied, the target of comparison does 
not need to undergo movement, so no intervention effect is predicted to arise. 



Elizabeth Coppock & Linnea Strand 


6.1.2 Postnominal quality superlatives 


In all of the languages we have seen, there are constructions in which the su- 
perlative occurs post-nominally; (89-92) are some examples repeated from the 
discussions above. 


(89) Spania haidevo tin mikroteri ti gata. (Greek)
     seldom pet DEF smallest DEF cat
     ‘I seldom pet the smallest cat.’


(90) A scris compunere-a cea mai frumoasă. (Romanian)
     has written composition-DEF DEF CMP beautiful
     ‘She wrote the most beautiful composition.’


(91) celui de la famille avec la taille la plus fine (French)
     the.one of the family with the waist DEF CMP fine
     ‘the one in the family with the thinnest waist.’


(92) La mamma fa i biscotti più buoni del mondo. (Italian)
     DEF mom makes DEF cookies CMP tasty of.DEF world
     ‘Mom bakes the yummiest cookies in the whole world.’


In Greek, Romanian and French, the postnominal superlative is accompanied 
by a second definiteness-marker (this is specific to superlatives only in Romanian 
and French). For such cases, it is convenient to adopt Coppock & Beaver's (2015) 
predicative treatment of the definite article, whereby it denotes a function from 
predicates to predicates, presupposing uniqueness but not existence. It is also 
important for our purposes to restrict the domain of a definite determiner to a 
salient comparison class C. Thus we adopt the lexical entry shown in (93) for 
Romanian cel, for example. 


(93) cel$_C$ $\leadsto \lambda P \lambda x .\, \partial(|P \cap C| \leq 1) \wedge P(x) \wedge C(x)$


(Here $\partial$ is the ‘partial’ operator, whose scope is presupposed material. It evaluates
to the 'undefined' truth value unless its scope is true.) With this, we derive
the interpretation in (94) for the superlative phrase in (90): 


(94) cel$_C$ mai frumoasă
$\leadsto \lambda x .\, \partial(|\lambda x' .\, \mathrm{BEAUTIFUL}(x') > \mathbf{d} \wedge C(x')| \leq 1) \wedge \mathrm{BEAUTIFUL}(x) > \mathbf{d} \wedge C(x)$

This description characterizes a composition x in C that is the only one whose
beauty exceeds d. Combining this phrase with the definite article on the noun
yields a derivation of the following form for the full noun phrase (we assume
that the suffix -a in compunere-a ‘the composition’ is interpreted in D, and we 
represent it in 95 as an iota operator for simplicity, although it can also be given 
a treatment along the lines of 93): 


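(95) $e$
     ├── -a : $\langle \langle e, t \rangle, e \rangle$
     └── $\langle e, t \rangle$
         ├── compunere : $\langle e, t \rangle$
         └── cel$_C$ mai frumoasă : $\langle e, t \rangle$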
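
The behaviour of the predicative entry in (93) can be checked on a toy model. The following is a minimal sketch under stated assumptions: the compositions and their beauty degrees are invented, the 'undefined' truth value is modelled as Python's `None`, and undefinedness is assumed to project through conjunction (a weak Kleene choice that the text does not commit to).

```python
from typing import Callable, Dict, Optional, Set

# Hypothetical compositions and beauty degrees, for illustration only.
BEAUTY: Dict[str, float] = {"comp_1": 6.0, "comp_2": 9.0, "comp_3": 7.5}


def beautiful(x: str) -> float:
    """Measure function BEAUTIFUL."""
    return BEAUTY[x]


def partial(condition: bool) -> Optional[bool]:
    """The partial operator: True if its scope is true, otherwise undefined (None)."""
    return True if condition else None


def conj(*values: Optional[bool]) -> Optional[bool]:
    """Conjunction in which an undefined conjunct makes the whole formula undefined."""
    if any(v is None for v in values):
        return None
    return all(values)


def cel(C: Set[str]) -> Callable:
    """(93): lambda P . lambda x . partial(|P intersect C| <= 1) and P(x) and C(x)."""
    def apply(P: Callable[[str], bool]) -> Callable[[str], Optional[bool]]:
        def pred(x: str) -> Optional[bool]:
            at_most_one = len({y for y in C if P(y)}) <= 1
            return conj(partial(at_most_one), P(x), x in C)
        return pred
    return apply


def mai_frumoasa(d: float) -> Callable[[str], bool]:
    """The comparative after Definite Null Instantiation, with d supplied explicitly."""
    return lambda x: beautiful(x) > d


C = set(BEAUTY)
print(cel(C)(mai_frumoasa(8.0))("comp_2"))   # True: comp_2 alone exceeds 8
print(cel(C)(mai_frumoasa(8.0))("comp_1"))   # False: defined, but comp_1 does not exceed 8
print(cel(C)(mai_frumoasa(7.0))("comp_2"))   # None: two compositions exceed 7
print(cel(C)(mai_frumoasa(10.0))("comp_2"))  # False: nothing exceeds 10, yet still defined
```

The last call reflects the fact that (93) presupposes uniqueness but not existence: an empty extension satisfies the cardinality condition.
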


6.2 Quantity superlatives 


The picture is much richer when it comes to quantity superlatives. In all of the 
languages we have considered, quantity superlatives differ at least to some extent
from quality superlatives, if not with respect to definiteness-marking (as in
Italian) then with respect to definiteness-spreading in object position (Greek), 
use of a pseudopartitive construction (French), or pre- vs. postnominal word or- 
der (Romanian). We therefore posit that quantity superlatives are of a different 
semantic type from quality superlatives (across the board), namely: predicates of 
degrees, rather than individuals. We have adopted a measure function approach 
to the semantics of gradable predicates, so that an adjective like tall for example 
is translated as an expression of type $\langle e, d \rangle$, mapping an individual to a degree.
The parallel treatment for a quantity word like much or many would then be
$\langle d, d \rangle$; just as tall maps an individual to its height, much maps a quantity to its
magnitude. The magnitude of a quantity might as well be seen as the quantity 
itself, so we will simply treat quantity words as identity functions on degrees. 
Thus for Greek, we have (96) and (97): 


(96) pollá $\leadsto \lambda d .\, d$


(97) pio pollá (after DNI) $\leadsto \lambda d' .\, d' > \mathbf{d}$


Now, we cannot use Predicate Modification to combine with the noun (and 
this predicts that definiteness spreading should be problematic.) Let us assume 


403 


Elizabeth Coppock & Linnea Strand 


that what happens instead is that the degree predicate is linked to the nominal 
predicate by the same glue that holds a pseudopartitive together. We implement 
this with the composition rule called Measure Identification in (98). The result is 
a predicate that holds of some individual x if the nominal predicate holds of x 
and x has an extensive measure satisfying the degree predicate. 


(98) Measure Identification (Composition Rule)
If $\gamma$ is a subtree whose only two immediate subtrees are $\alpha$ and $\beta$, and
$\alpha \leadsto D$, where $D$ is of type $\langle d, t \rangle$, and $\beta \leadsto P$, where $P$ is of type $\langle \tau, t \rangle$,
where $\tau$ is any type, then

$\gamma \leadsto \lambda v .\, D(\mu_i(v)) \wedge P(v)$

where $v$ is a variable of type $\tau$ and $\mu_i$ is a free variable over measure
functions (type $\langle \tau, d \rangle$).


We use $\mu_i$ to denote a contextually-salient measure function along the lines of
Wellwood (2014), with $i$ as a free variable index presumed to be constrained by
context. So given a predicate of degrees $D$ and a predicate of individuals $P$, this
operation yields $\lambda x .\, D(\mu_i(x)) \wedge P(x)$. (99) is an example (assuming the plural is
translated using the cumulativity operator *; cf. Link 1983):


(99) pio pollá órgana $\leadsto \lambda x .\, \mu_i(x) > \mathbf{d} \wedge {}^{*}\mathrm{INSTRUMENT}(x)$


This is the right sort of thing to combine with a definite article as long as 
d is chosen appropriately. The definite article introduces a comparison class C. 
So ta pio pollá órgana will be predicted to denote the plurality of instruments 
in C whose contextually-relevant extensive measure is d. The structure of the 
derivation is thus as in (100): 
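(100) $e$
      ├── ta : $\langle \langle \tau, t \rangle, \tau \rangle$
      └── $\langle e, t \rangle$ (by Measure Identification)
          ├── $\langle d, t \rangle$
          │   ├── $\langle \langle \tau, d \rangle, \langle \tau, t \rangle \rangle$
          │   │   └── pio : $\langle d, \langle \langle \tau, d \rangle, \langle \tau, t \rangle \rangle \rangle$
          │   └── pollá : $\langle d, d \rangle$
          └── órgana : $\langle e, t \rangle$
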
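
Measure Identification in (98) and its application in (99) can likewise be sketched in code. The assumptions are labeled in the comments: pluralities are modelled as frozensets of atoms, the contextual measure function is taken to be cardinality, the comparison class C is a hypothetical list of pluralities, and the value of the unbound degree variable is passed explicitly.

```python
from typing import Callable, FrozenSet, List, Optional

Plurality = FrozenSet[str]

INSTRUMENTS = ("piano", "violin", "flute")  # hypothetical atoms


def star_instrument(x: Plurality) -> bool:
    """*INSTRUMENT: true of any non-empty plurality made up of instruments."""
    return len(x) > 0 and all(atom in INSTRUMENTS for atom in x)


def polla(d: int) -> int:
    """(96): the quantity word as the identity function on degrees."""
    return d


def pio(g: Callable) -> Callable:
    """(78): lambda g . lambda d . lambda x . g(x) > d."""
    return lambda d: lambda x: g(x) > d


def cardinality(x: Plurality) -> int:
    """Assumed value for the contextually salient measure function."""
    return len(x)


def measure_identification(D: Callable[[int], bool],
                           P: Callable[[Plurality], bool],
                           mu: Callable[[Plurality], int]) -> Callable[[Plurality], bool]:
    """(98): lambda v . D(mu(v)) and P(v)."""
    return lambda v: D(mu(v)) and P(v)


def pio_polla_organa(d: int) -> Callable[[Plurality], bool]:
    """(99), with d supplied for the unbound degree variable."""
    degree_predicate = pio(polla)(d)  # (97): lambda d' . d' > d
    return measure_identification(degree_predicate, star_instrument, cardinality)


def the(pred: Callable[[Plurality], bool], domain: List[Plurality]) -> Optional[Plurality]:
    """Definite article restricted to C: the unique satisfier, or None."""
    matches = [x for x in domain if pred(x)]
    return matches[0] if len(matches) == 1 else None


# Hypothetical comparison class: the pluralities played by three people.
C = [frozenset({"piano"}),
     frozenset({"piano", "violin"}),
     frozenset({"piano", "violin", "flute"})]

result = the(pio_polla_organa(2), C)
print(sorted(result))                 # ['flute', 'piano', 'violin']
print(the(pio_polla_organa(1), C))    # None: two pluralities exceed 1
```
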
In Romanian, the definite element cel forms a constituent with the comparative
element and the quantity word to the exclusion of the noun. We therefore posit
the structure in (101) for the semantic derivation:


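(101) $\langle e, t \rangle$ (by Measure Identification)
      ├── $\langle d, t \rangle$
      │   ├── cele : $\langle \langle \tau, t \rangle, \langle \tau, t \rangle \rangle$
      │   └── $\langle d, t \rangle$
      │       ├── $\langle \langle \tau, d \rangle, \langle \tau, t \rangle \rangle$
      │       │   └── mai : $\langle d, \langle \langle \tau, d \rangle, \langle \tau, t \rangle \rangle \rangle$
      │       └── multe : $\langle d, d \rangle$
      └── instrumente : $\langle e, t \rangle$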


The meaning for this expression as a whole characterizes a plurality of instru- 
ments whose measure is greatest among any of the degrees in the context. In 
the case of a relative reading, the set of degrees that are salient in the context are 
aligned in a one-to-one relationship with some salient set of individuals, typically 
those individuals that are alternatives to the focused constituent. 
French has yet a different structure, involving a pseudopartitive, as illustrated
in (102).


(102) Je suis celui qui joue le plus d'instruments. (French)
      I am the-one who plays DEF CMP of-instruments
      ‘I am the one who plays the most instruments.’


Since French does not use a word for many parallel to Greek pollá or Romanian 
mult, we might posit either a silent underlying form with the same meaning, or 
we might imagine that French simply makes do without such an element. In the 
latter case, it is convenient to treat plus using the simplest imaginable lexical 
entry for comparison (Heim 2006; Beck 2010), namely (103): 


(103) plus $\leadsto \lambda d \lambda d' .\, d' > d$


Given this, we have the derivation in (104): 


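(104) $\langle e, t \rangle$
      ├── $d$
      │   ├── le : $\langle \langle d, t \rangle, d \rangle$
      │   └── $\langle d, t \rangle$
      │       └── plus : $\langle d, \langle d, t \rangle \rangle$
      └── $\langle d, \langle e, t \rangle \rangle$
          ├── de : $\langle \langle e, t \rangle, \langle d, \langle e, t \rangle \rangle \rangle$
          └── instruments : $\langle e, t \rangle$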


We assume that the Meas head acts as glue, linking the degree denoted by le 
plus with the denotation of the noun phrase such that the noun phrase is con- 
strained to have an extensive measure of that degree. The resulting denotation 
is just the same as that posited for Romanian. 

Finally, we come to Italian, which has the simplest overt form, as shown in 
(66) above, repeated here as (105): 


(105) ...che suona più strumenti. (Italian)
      ... that plays CMP instruments
      ‘... who plays the most instruments.’
One possible analysis is the one in (106), using a lexical entry for più like the
one given for French plus above.


(106) $\langle e, t \rangle$ (by Measure Identification)
      ├── $\langle d, t \rangle$
      │   └── più : $\langle d, \langle d, t \rangle \rangle$
      └── strumenti : $\langle e, t \rangle$


The predicate that this derives holds of any plurality of instruments x whose 
quantity exceeds d. This of course does not necessitate that there be no larger 
plurality of instruments in the context, so we have not captured a superlative 
interpretation. Assuming the same analysis carries over to Spanish, it remains 
an open question why superlatives undergo fronting and comparatives do not. 
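
The gap noted here can be made concrete with a toy check. In this sketch the pluralities, the degree value and the use of cardinality as the measure function are all illustrative assumptions; the point is only that the derived predicate is satisfied by more than one plurality, so nothing in it enforces maximality.

```python
from typing import Callable, FrozenSet, List

Plurality = FrozenSet[str]


def piu(d: int) -> Callable[[int], bool]:
    """An entry like (103) with its first argument saturated: lambda d' . d' > d."""
    return lambda d_prime: d_prime > d


def star_strumento(x: Plurality) -> bool:
    """Toy plural predicate: true of any non-empty plurality in the model."""
    return len(x) > 0


def measure_identification(D: Callable[[int], bool],
                           P: Callable[[Plurality], bool],
                           mu: Callable[[Plurality], int]) -> Callable[[Plurality], bool]:
    """(98): lambda v . D(mu(v)) and P(v)."""
    return lambda v: D(mu(v)) and P(v)


# più strumenti with the degree variable set to 1 and cardinality as the measure.
piu_strumenti = measure_identification(piu(1), star_strumento, len)

# Hypothetical pluralities of instruments in the context.
PLURALITIES: List[Plurality] = [
    frozenset({"piano"}),
    frozenset({"piano", "violino"}),
    frozenset({"piano", "violino", "flauto"}),
]

satisfiers = [x for x in PLURALITIES if piu_strumenti(x)]
print(len(satisfiers))  # 2: both the two-membered and the three-membered plurality qualify
```
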


6.3 Adverbial superlatives 


For adverbial quantity superlatives, we start with the assumption that a verb 
phrase denotes a property of events, translating to an expression of type $\langle v, t \rangle$,
and that the DEF+CMP construction combines with it via Measure Identification. 
For example, in Greek we have (107): 



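(107) $\langle v, t \rangle$ (by Measure Identification)
      ├── VP : $\langle v, t \rangle$
      └── $\langle d, t \rangle$
          ├── DEF : $\langle \langle \tau, t \rangle, \langle \tau, t \rangle \rangle$
          └── $\langle d, t \rangle$
              ├── $\langle \langle \tau, d \rangle, \langle \tau, t \rangle \rangle$
              │   └── pio : $\langle d, \langle \langle \tau, d \rangle, \langle \tau, t \rangle \rangle \rangle$
              └── pollá : $\langle d, d \rangle$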


Adverbial quality superlatives, on the other hand, involve gradable predicates 
that measure events as in (108): 


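(108) $\langle v, t \rangle$ (by Predicate Modification)
      ├── VP : $\langle v, t \rangle$
      └── $\langle v, t \rangle$
          ├── $\langle \langle \tau, d \rangle, \langle \tau, t \rangle \rangle$
          │   └── pio : $\langle d, \langle \langle \tau, d \rangle, \langle \tau, t \rangle \rangle \rangle$
          └── grigora : $\langle v, d \rangle$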


We suggest that this difference in type underlies the contrast between quantity 
and quality adverbial superlatives in Greek: the Greek definite determiner applies 
to predicates of type $\langle d, t \rangle$ but not ones of type $\langle v, t \rangle$. In Italian, neither type of
adverbial superlative is marked definite; this can be understood as an aversion 
to definiteness-marking on predicates of both types. In French and Romanian, 
on the other hand, both types are definite, and this can be understood under the 
lens of a maximally polymorphic definite determiner. 
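
The type-based account of adverbial superlatives just described can be summarized as a lookup, sketched below. This is only a restatement of the paragraph above under simplifying assumptions: the dictionary names are invented, each language's definite determiner is represented by the set of argument types it tolerates, and the inclusion of type e for every language is an assumption based on the definiteness-marking of adnominal quality superlatives.

```python
from typing import Dict, Set

# Types of entities a definite determiner may combine with a predicate of:
# "e" individuals, "d" degrees, "v" events. The "e" entries are an added assumption.
ARTICLE_DOMAINS: Dict[str, Set[str]] = {
    "Greek": {"e", "d"},
    "Italian": {"e"},
    "French": {"e", "d", "v"},
    "Romanian": {"e", "d", "v"},
}

# Type of the predicate formed by each kind of adverbial superlative,
# following the derivations in (107) and (108).
SUPERLATIVE_TYPES: Dict[str, str] = {
    "quantity adverbial": "d",
    "quality adverbial": "v",
}


def marked_definite(language: str, construction: str) -> bool:
    """Predict definiteness-marking from the determiner's admissible types."""
    return SUPERLATIVE_TYPES[construction] in ARTICLE_DOMAINS[language]


for language in ARTICLE_DOMAINS:
    row = {c: marked_definite(language, c) for c in SUPERLATIVE_TYPES}
    print(language, row)
```
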
6.4 Proportional readings


Proportional readings for quantity superlatives are not fully available in French, 
Spanish, or Italian, but they are available in Greek and Romanian. From a larger 
typological perspective, Greek and Romanian are the odd ones out; most lan- 
guages lack proportional readings for the superlative of ‘many’ (Coppock et al. 
2017). In line with Coppock et al. (in prep), we suggest that this is related to 
our proposal that quantity words typically denote predicates of degrees rather 
than individuals, and their comparatives likewise compare degrees rather than 
individuals. A definite determiner that combines directly with the comparative 
of a quantity word after Definite Null Instantiation produces a phrase denoting 
a degree or amount that is greatest among some contextually-salient set of degrees.
Thus for example le plus in le plus d'instruments would have a denotation like
‘the greatest number’ or ‘the greatest amount’. Notice that the phrase the greatest 
number only has a relative reading. Consider (109): 


(109) Maria has visited the greatest number of continents. 


This cannot mean that Maria has visited more than half of the continents. If le 
plus means the same thing as the greatest number, then it, too, should only have 
relative readings. According to Coppock et al. (in prep), the reason that such 
cases have only relative readings is related to a general constraint on the inter- 
pretation of superlatives. This view makes a distinction in principle between the 
entities that are actually measured by the gradable predicate to which superlative 
morphology attaches, the measured entities, and what they call the contrast set, 
following Coppock & Beaver (2014). On relative readings, the contrast set and 
the measured entities are distinct and related by a salient association relation 
given by the sentence. On absolute readings, they are conflated. Coppock et al. 
(in prep) posit a constraint on the contrast set, according to which it must con- 
sist of individuals. When the gradable predicate measures degrees rather than 
individuals, the contrast set must be distinct from the set of measured entities; 
hence a relative reading is forced. 

How, then, do proportional readings arise? Dobrovie-Sorin & Giurgea (2015) 
suggest that they arise through grammaticalization, which requires full gram- 
matical agreement (present in both Greek and Romanian), and is preempted by 
the pseudopartitive construction that French uses with relative readings. On this 
perspective, it is a matter of historical accident whether a given language has 
developed a proportional determiner from a quantity superlative. We are sym- 
pathetic to this view. We would only note that if indeed Greek and Romanian 
involve different constituency relations when it comes to relative readings, as 
suggested above, then the putative grammaticalization process must be of a different
nature for the two languages. We would like to suggest that in Greek, 
proportional readings arise through a process similar to the one envisioned by 
Hoeksema (1983), where the quantity word comes to denote a gradable predicate 
of (plural) individuals, and the comparison class for the superlative is constituted 
by two non-overlapping pluralities, one consisting of atoms that satisfy the pred- 
icate in question and one consisting of atoms that do not. Such an analysis is 
consonant with the idea that the definite determiner is in its ordinary position in 
Greek, rather than more tightly integrated with the comparative marker. In Ro- 
manian, on the other hand, there is a constituent containing the definite article, 
the comparative marker, and the quantity word; this phrase could potentially be 
reanalyzed as a complex determiner. 


7 Conclusion and outlook 


We have suggested that superlative interpretations arise in DEF+CMP languages 
with the help of an interpretive process called Definite Null Instantiation for 
the target argument of a comparative. It is reasonable to ask whether this pro- 
cess is restricted to DEF+CMP languages or available more broadly. We suggest 
that it is available at least somewhat more broadly, and that English is one of 
the languages that avails itself of it, in constructions like the taller of the two 
(discussed from a formal semantic perspective by Szabolcsi 2012). Why English 
doesn't generally form superlatives using this strategy could be explained in 
terms of markedness; since there is a dedicated superlative morpheme in En- 
glish, it should be used whenever the comparison class contains more than two 
members. 

The pattern of variation suggests that a number of competing pressures are 
at play. One pressure is to mark uniqueness of a description overtly. Another 
pressure is to avoid combining a definite determiner with a predicate of entities 
other than individuals, such as events or degrees. We have assumed that qual- 
ity adverbs denote gradable predicates of events, and that quantity words denote 
predicates of degrees. The pressure to avoid combining definite determiners with 
predicates of events rules out definiteness-marking on adverbial quality superla- 
tives, and similarly for predicates of degrees and quantity superlatives. 

In Optimality Theoretic terms, we might conceive of these forces as constraints 
that we could label *DEF/d (“do not use a definite determiner with a predicate of 
degrees”), *DEF/v (“do not use a definite determiner with a predicate of events”) 
and MARK-UNIQUENESS. Italian ranks the former two over the latter: 

*DEF/d, *DEF/v > MARK-UNIQUENESS 

while French ranks the latter over the former two: 

MARK-UNIQUENESS > *DEF/d, *DEF/v 


An adverbial superlative like le moins fort (French, lit. ‘the less fast’) violates 
*DEF/v but not MARK-UNIQUENESS, while one like más rápido (Spanish, lit. ‘more 
fast’) violates MARK-UNIQUENESS but not *DEF/v. Greek draws the line at adverbial
quality superlatives, which suggests that it ranks MARK-UNIQUENESS over 
*DEF/v, but not over *DEF/d: 


*DEF/d > MARK-UNIQUENESS > *DEF/v 


Intuitively, MARK-UNIQUENESS should require that any descriptive phrase which 
is presupposed to apply to at most one individual is marked with a lexical item 
that conventionally signals this presupposition. But there may be slightly differ- 
ent shades of this constraint for different languages. Recall that in Italian (and 
Spanish), the definite article is normally used in predicative superlatives, pre- 
sumably to distinguish between the comparative and the superlative interpreta- 
tions. But the relative clause construction serves to mark uniqueness in some 
sense, rendering the definite article unnecessary. This sort of explanation could 
be made more precise by imagining a version of the MARK-UNIQUENESS constraint 
in Ibero-Romance that imposes slightly different requirements. Suppose that in 
Ibero-Romance, the operative MARK-UNIQUENESS constraint may be satisfied in 
some cases where a candidate phrase with unique descriptive content is not ac- 
tually marked as unique, as long as it is embedded in a larger phrase with unique 
descriptive content which is. So Ibero-Romance might have a “once per discourse 
referent” rule, while French might have a “once per phrase” rule. Syntactic
restrictions would presumably also come into play. 

This hypothesized difference could also apply to bare postnominal superla- 
tives, which are found in Italian but not French. This idea would have to be evalu- 
ated in light of previous ideas regarding this contrast. According to Kayne (2008), 
the reason has to do with the licensing of bare nouns in general. Alexiadou (2014: 
74–75) suggests an approach appealing to the richness of agreement features. 
Matushansky (2008a) argues that superlatives are always attributive modifiers 
of nouns, so a nominal structure is projected around a superlative in the post- 
nominal case; perhaps Italian does not do that. We leave it to future research to 
compare these possible explanations for the difference. 

Future research on this topic should also bring into the discussion a wider 
range of languages that use this strategy. For example, Plank (2003) briefly dis- 
cusses the very interesting case of Maltese, which makes use of fronting to dis- 
tinguish the superlative degree (110c) from the comparative (110b). 


(110) a. il-belt  il-qawwi (Maltese) 
         DEF-city DEF-powerful 
         ‘the powerful city’ 
      b. il-belt  l-aqwa 
         DEF-city DEF-powerful.CMP 
         ‘the more powerful city’ 
      c. l-aqwa           belt 
         DEF-powerful.CMP city 
         ‘the most powerful city’ 


As Plank (2003: 361-362) points out, “Paradoxically, as a result of this fronting, 
NPs with superlatives thus end up less articulated than NPs with other adjectives 
in normal postnominal position.” Plank posits that “Just like le plus jeune homme 
[...] in French, [superlatives in Maltese] are in fact under-articulated: there ought 
to be two definiteness markers on the initial superlative, one by virtue of it being 
a superlative, another by virtue of it being NP-initial.” Further issues for future 
work include whether and how the approach we have taken here, in terms of 
competing pressures, can be fruitfully applied to Maltese and other DEF+CMP 
languages. 

We propose that the phenomenon of definite reduplication in Greek involves using 
the definite determiner D as domain restrictor in the sense of Etxeberria &
Giannakidou (2009). The use of D as a domain-restricting function with quantifiers has 
been well documented for European languages such as Greek, Basque, Bulgarian 
and Hungarian, and typically results in a partitive-like interpretation of the QP. 
We propose a unifying analysis that treats domain restriction and D-reduplication 
as the same phenomenon; and in our analysis, D-reduplication emerges semanti- 
cally as similar to a partitive structure, a result resonating with earlier claims to 
this end by Kolliakou (2004). None of the existing accounts of definites can cap- 
ture the correlations in the use of D with quantifiers and in reduplication that we 
establish here. 


1 Quantifiers, domain restriction, and D 


One of the most fruitful ideas in the formal semantics tradition has been the thesis
that quantifier phrases (QPs) denote generalized quantifiers (GQs; see Montague
1974; Barwise & Cooper 1981; Westerståhl 1984; Partee 1986; Zwarts 1986; 
Keenan 1987; 1996; Keenan & Westerståhl 1997; among many others). Classical 
GQ theory posits that there is a natural class of expressions in language, called 
quantificational determiners (Qs), which combine with a nominal constituent (an 
NP of type et, a first order predicate) to form a quantifier nominal (QP). This QP 
denotes a GQ, a set of sets. In a language like English, the syntax of a QP like 
every woman is as follows: 


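(1) a. $[\![\text{every woman}]\!] = \lambda Q.\ \forall x.\ \mathrm{woman}(x) \rightarrow Q(x)$ 
    b. $[\![\text{every}]\!] = \lambda P.\ \lambda Q.\ \forall x.\ P(x) \rightarrow Q(x)$ 
    c. [QP$_{\langle et,t\rangle}$ [Q$_{\langle et,\langle et,t\rangle\rangle}$ every] [NP$_{\langle e,t\rangle}$ woman]] 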


The Q every combines first with the NP argument woman, and this is what we 
have come to think of as the "standard" QP-internal syntax. The NP argument 
provides the domain of the Q, and the Q expresses a relation between this domain 
and the set denoted by the VP. Qs like every, most, etc. are known as STRONG, and 
they contrast with the so-called WEAK quantifiers like e.g. some, few, three, many 
(Milsark 1977). 
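The relational view of Q in (1) can be made concrete over a small finite model. This is a minimal sketch: predicates are modelled as Python sets, and the individuals and predicate extensions are invented purely for illustration.

```python
# Toy model; all individuals and extensions below are hypothetical.
WOMAN = frozenset({"ann", "bea"})
SMOKES = frozenset({"ann", "bea", "carl"})
RUNS = frozenset({"ann"})

def every(P):
    """[[every]] = λP.λQ.∀x. P(x) → Q(x), with predicates modelled as sets."""
    return lambda Q: all(x in Q for x in P)

def some(P):
    """A weak quantifier for comparison: λP.λQ.∃x. P(x) ∧ Q(x)."""
    return lambda Q: any(x in Q for x in P)

# Q combines first with its NP argument; the result is a GQ,
# i.e. a function from sets (VP denotations) to truth values.
every_woman = every(WOMAN)

print(every_woman(SMOKES))
print(every_woman(RUNS))
print(some(WOMAN)(RUNS))
```

Currying the determiner mirrors the QP-internal syntax in (1c): the NP supplies the domain, and only then is the VP denotation consumed.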

It has also long been noted that the domain of strong quantifiers is contextu- 
ally (explicitly or implicitly) restricted (see inter alia Reuland & ter Meulen 1989). 
Contemporary work agrees that we need to encode contextual restriction in the 
QP, but opinions vary as to whether contextual restriction is part of the
syntax/semantics (Partee 1986; von Fintel 1994; 1998; Stanley & Szabó 2000; Stanley 
2002; Matthewson 2001; Martí 2003; Giannakidou 2004; Etxeberria 2005; 2008; 
2009; Gillon 2006; 2009; Etxeberria & Giannakidou 2009; 2014; Giannakidou & 
Rathert 2009), or not (Recanati 1996; 2004; 2007 and others in the strong contex- 
tualism tradition). In the syntax-semantics approach, it is assumed that the do- 
mains of Qs are contextually restricted by covert domain variables at LF (which 
are usually free, but can also be bound, and they can be either atomic, e.g. C, 
or complex of the form f(x), corresponding to selection functions; see von Fintel 
1998; Stanley 2002; Martí 2003). Below, we employ C: 


(2) Many people came to the concert last night; every student got drunk. 
(3) $\forall x\,[\mathrm{student}(x) \wedge C(x)] \rightarrow \mathrm{got\ drunk}(x)$ 


Here, the nominal argument of the universal quantifier every, i.e. student, is 
the set of students who came to the concert last night, not the students in the 
whole world. This is achieved by the domain variable C, which is an anaphor and 
will look back in the discourse for a salient property, in this case the set of people 
who came to the concert last night. Every student then will draw values from the 
intersection of student with C. 
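The effect of C in (3) can be illustrated with the same set-based modelling. This is a sketch under stated assumptions: the individuals are invented, and C is passed in by hand rather than resolved anaphorically from the discourse.

```python
# Hypothetical model for example (2); s3 is a student who was not at the concert.
STUDENT = frozenset({"s1", "s2", "s3"})
C = frozenset({"s1", "s2", "p1"})          # people who came to the concert
GOT_DRUNK = frozenset({"s1", "s2", "p1"})

def every(P):
    """λP.λQ.∀x. P(x) → Q(x) over finite sets."""
    return lambda Q: all(x in Q for x in P)

# Without contextual restriction, 'every student' ranges over all students.
unrestricted = every(STUDENT)(GOT_DRUNK)

# With the domain variable C, the quantifier draws values from STUDENT ∩ C.
restricted = every(STUDENT & C)(GOT_DRUNK)

print(unrestricted, restricted)
```

In this model the sentence is false on the unrestricted construal and true once the domain is intersected with C, which is the contrast the text describes.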

Another element that combines with a domain to give a nominal argument is 
the definite determiner, i.e. the English the and its equivalents (including demon- 
stratives), designated as D (Abney 1987; see Alexiadou et al. 2008 for an exten- 
sive overview). The demonstrative is generated in English under the same head 
(thus *this the book). The DP has a structure parallel to (1c), only we have D, and 
the constituent is called DP (though some authors call the Q uniformly D; see 
Matthewson 1998; Gillon 2009). As indicated below, the DP produces a referen- 
tial expression, a (maximal or unique) individual, indicated here with iota: 


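(4) [DP$_e$: $\iota(\lambda x.\,\mathrm{woman}(x))$ [D$_{\langle et,e\rangle}$ {the/this}] [NP$_{\langle e,t\rangle}$ woman: $\lambda x.\,\mathrm{woman}(x)$]] 

(5) a. the/this woman = $\iota(\lambda x.\,\mathrm{woman}(x))$ 
    b. the/these women = $\max(\lambda x.\,\mathrm{woman}(x))$ 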
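The two saturating operators in (5) can be sketched as partial functions over a finite model. The modelling choices are assumptions made for illustration: plural individuals are represented as non-empty frozensets of atoms, and undefinedness is represented by returning None.

```python
from itertools import combinations

def iota(P):
    """ι: the unique member of P; undefined (None) unless P is a singleton."""
    return next(iter(P)) if len(P) == 1 else None

def pluralize(atoms):
    """Close a set of atoms under sum: all non-empty pluralities."""
    atoms = list(atoms)
    return {frozenset(c)
            for r in range(1, len(atoms) + 1)
            for c in combinations(atoms, r)}

def maximal(plural_pred):
    """max: the plurality that contains every other plurality in the predicate."""
    for s in plural_pred:
        if all(t <= s for t in plural_pred):
            return s
    return None

print(iota({"ann"}))                       # singular definite, unique woman
print(iota({"ann", "bea"}))                # uniqueness fails
print(maximal(pluralize({"ann", "bea"})))  # plural definite
```

Both functions return an individual (type e) when defined, which is what makes D saturating in this use.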


The DP produces the most basic argument e which can be lifted up to the GQ 
type when necessary. Both D and Q are functions that need a domain, and it 
is the NP that provides this domain. Contextual presuppositions are indicated 
above in the indexing with C. The DP denotes the unique or maximal individual 
presupposed to exist in the common ground. Coppock & Beaver (2015) use
$\partial$-notation to capture the presupposition of uniqueness as the argument of the
$\partial$ operator: 


(6) Lexical entry: the 
    the $\leadsto \lambda P.\,\lambda x.\,[\partial(|P| \leq 1) \wedge P(x)]$ 


Notice that, contrary to all other approaches, for Coppock & Beaver (2015) 
the is a non-saturated constituent in the referential use. We come back to this 
assumption later. We take it here that the use of D creates a morphologically 
definite argument; it is thus the core of what can be understood as "definiteness". 

DP has been argued to exhibit different types of referentiality. For one thing, 
a DP can be generic and refer to a kind which is itself a very different "object" 
than a concrete unique entity in the world. Observe, in addition, the following: 

(7) a. John got these data from the student of a linguist. 
    b. John went to the store. 
    c. I read the newspaper every day. 
    d. I raised my hand. 


In the examples here the DPs do not make reference to unique entities: the lin- 
guist in (7a) possibly has more than one student; in (7b) the particular identity of 
the store to which John has gone is not important, and the store is certainly not 
unique; (7c) can be used in a context in which no newspaper has been mentioned 
or in which multiple newspapers are read; in (7d) my hand is used to make reference
to one of my two hands. Poesio (1994) introduced the term “weak definite” 
to refer to such “non-uniquely referential” uses of D (see among others Carlson 
& Sussman 2005; Schwarz 2009; Aguilar-Guevara & Zwarts 2011; Corblin 2013). 
More recent relevant work identifies “sloppy” identity, narrow scope interpreta- 
tion, lexical restrictions (John took the bus vs #John took the coach), restrictions 
on modification, number restrictions, and meaning enrichment (John went to the 
store means that John went to a store to do some shopping) for such non-unique 
DPs (see Carlson & Sussman 2005; Aguilar-Guevara et al. 2014). 

In some languages, the referential strength of DP is reflected in a difference 
between weak and strong forms of D itself (Cieschinger 2006; Puig Waldmüller 
2008; Schwarz 2009). In Standard German, for example, a preposition and the 
definite article can be contracted (zum vs. zu dem). Schwarz (2009) proposes that 
the strong/non-contracted D is used when the noun phrase is anaphoric (a prag- 
matic definite) and it picks up a unique/given referent from the discourse; the 
weak/contracted article is used when the noun phrase has unique reference on 
the basis of its own description. 

In the present paper, we discuss two puzzles of D in Greek and Basque that 
cannot be described by the existing approaches in terms of non-uniqueness or 
weak/strong D. In the cases we focus on, D appears in a non-canonical position:
(a) on a quantificational determiner; and (b) in multiple-D structures. Let us 
illustrate the first, which is also found in Salish languages, Hungarian and
Bulgarian. D can be an independent head (Greek, St'át'imcets)¹ or a suffix (Basque, 
Bulgarian): 


¹The St'át'imcets D has a proclitic part (ti for singulars; i for plurals) encoding deictic and
number morphology, and an enclitic part ...a that attaches to the first lexical item in the DP
(Matthewson 1998). 

(8) Greek (Giannakidou 2004: 121) 
    a.   o      kathe fititis 
         DET.SG every student 
         ‘each student’ 
    b. * kathe o      fititis 
         every DET.SG student 
         (‘each student’) 


(9) Basque (Etxeberria 2005: 41–42) 
    a.   mutil guzti-ak 
         boy   all-DET.PL 
         ‘all the boys’ 
    b.   mutil bakoitz-a 
         boy   each-DET.SG 
         ‘each boy’ 
    c. * mutil guzti / * mutil bakoitz 
         boy   all   /   boy   each 
         (‘all boys / each boy’) 
    d. * mutil-ak   guzti 
         boy-DET.PL all 
         (‘all the boys’) 
    e. * mutil-a    bakoitz 
         boy-DET.SG each 
         (‘each boy’) 


(10) St'át'imcets Salish (Matthewson 1999; 2001) 
     a. i      tákem-a sm’ulhats 
        DET.PL all-DET woman 
        ‘all of the women’ 
     b. i      zi7zeg’-a sk’wemk’ük’wm’it 
        DET.PL each-DET  child(PL) 
        ‘each of the children’ 

(11) Hungarian (Szabolcsi 2010) 
     a.   minden diák 
          every  student 
          ‘every student’ 
     b.   az  összes diák 
          the all    student 
          ‘all the students’ 
     c. * összes az  diák 
          all    the student 
          (‘all the students’) 

(12) Bulgarian (Schürcks et al. 2014) 
     a. vsjako momče 
        every  boy 
        ‘every boy’ 
     b. vsički-te    momčeta 
        every-DET.PL boy.PL 
        ‘all the boys’ 

These data are unexpected under the standard analysis of DP, because D combines 
with a Q and not with an NP. Hence D above does not have the proper input et, 
and instead combines with the wrong type, a Q (type et,ett). That should be ruled 
out, as indeed happens in English *the every boy. In Greek, Basque, St'át'imcets, 
Hungarian, or Bulgarian the mismatch is “salvaged”, we argued in earlier work, 
by the ability of D to function as a domain restrictor 
(Giannakidou 2004; Etxeberria 2005; Etxeberria & Giannakidou 2009; 2014). 

In the present paper, we will argue that the domain restriction function of D 
is key to understanding the phenomenon of definite reduplication in Greek. This 
phenomenon involves multiple occurrences of D within the same DP: 

(13) Greek 
     a. to  kalo to  paidi 
        the good the child 
        ‘the good child’ 
     b. to  kalo paidi 
        the good child 
        ‘the good child’ 

The D-reduplicated structure is puzzling because there is only one referent 
(just like with the simple definite to kalo paidi ‘the good child’); and, just like 
with D on Q, one of the two Ds combines with an adjective, a prima facie non- 
canonical combination. Definite reduplication occurs in other languages, e.g. 
Swedish (but not in Danish, a related language), although in this paper we will 
only concentrate on Greek D-reduplication: 


(14) Swedish 
     den gamla mus-en 
     the old   mouse-DEF 
     ‘the old mouse’ 


Although Greek definite reduplications, or polydefinites, as Kolliakou (2004) 
calls them, have received lots of attention in the literature (see Alexiadou & 
Wilder 1998; Campos & Stavrou 2004; Kolliakou 2004; Ioannidou & den Dikken 
2006; Lekakou & Szendroi 2007), there is no consensus on what exactly the 
proper treatment is, with accounts ranging from vacuity of D to close apposition. 
In addition, polydefinites have never been linked to the use of D with quantifiers. 

In our paper, we will connect the two phenomena and argue that they are both 
manifestations of the function of D as a domain restrictor. The only difference 
between the two is that in one case D applies to Q, whereas with polydefinites D 
applies to a predicate. At the same time, it is important to note that neither 
of the two phenomena can be captured by the concepts of “weak definiteness” 
or “determinacy” (Coppock & Beaver 2015) used in the literature. Importantly, 
our analysis of the two phenomena renders them akin to partitives semantically, 
and from this it follows that partitive structures, domain restriction, and definite 
reduplication are different but related strategies for partitivity. 

The discussion proceeds as follows. We illustrate first, in §2, the theory of D as 
domain restrictor developed in our earlier work, specifically when D applies to 
Q. In §3, we present the option of D as domain restriction on the NP, an option 
observed in Salish languages. We point out that this option is a direct equivalent 
to a partitive semantically, and then focus on multiple definites (§4). We suggest 
here that multiple definites are the Greek equivalent to the Salish strategy. Our 
analysis is most related to Kolliakou (2004), and predicts a number of behaviors 
consistent with partitivity. 

Our overall conclusion is that “definiteness” is a family of phenomena revealing
the following functions of D: 

(15) Types for D 
     • Saturating: 
       – $et \rightarrow e$ (iota); intensionalized version (generic) 
     • Non-saturating: 
       – $\langle et,ett\rangle \rightarrow \langle et,ett\rangle$ (D_DR on Q) 
       – $et \rightarrow et$ (D_DR on NP or AP) 


"Weak definiteness" D, in contrast to domain restriction, is a saturating func- 
tion, and determinacy (Coppock & Beaver 2015) only relates to the b-version of 
non-saturating D. 


2 D as a domain restrictor 


In recent work, Giannakidou (2004), Etxeberria (2005), and Etxeberria & Gian- 
nakidou (2009; 2014) proposed that supplying C is a function that D heads can 
perform cross-linguistically. We based this idea on Westerståhl (1984; 1985), who 
argued that the definite article supplies a context set C; our proposal was that 
supplying C actually happens as an overt syntactic strategy in some languages. 
Domain restricting D is a non-saturating, type-preserving (i.e. modifier) function 
that applies to the Q and adds the C variable to the nominal argument of Q. This 
is akin to property anaphora, since C is anaphoric to a property present in the 
context, as we said earlier. Domain restricting D comes in two forms: as a Q 
modifier or as a predicate modifier, found in St'át'imcets and similar languages 
(Matthewson 2001; Gillon 2006; 2009). Definite reduplication, we will argue, is 
the manifestation of the predicate modifier strategy in Greek. 


2.1 D on Q and property anaphora 


Recall the examples mentioned in the introduction. We repeat here only the 
Greek and Basque data for simplicity. Etxeberria & Giannakidou (2009; 2014) 
propose that D here is a modifier function D_DR, defined as in (18): 


(16) Greek (Giannakidou 2004) 
     a.   o      kathe fititis 
          DET.SG every student 
          ‘each student’ 
     b. * kathe o      fititis 
          every DET.SG student 
          (‘each student’) 

(17) Basque (Etxeberria 2005) 
     a.   mutil guzti-ak   // mutil bakoitz-a 
          boy   all-DET.PL // boy   each-DET.SG 
          ‘all the boys // each boy’ 
     b. * mutil guzti/bakoitz; * mutil-ak   guzti; * mutil-a    bakoitz 
          boy   all/each         boy-DET.PL all      boy-DET.SG each 
          (‘all boys / each boy; all the boys; each boy’) 


(18) D to D_DR type-shifting: 
     1. D_DR rule: When D composes with Q, use D_DR. 
     2. $D_{DR} = \lambda Z_{et,ett}\,\lambda P_{et}\,\lambda Q_{et}.\ Z(P \cap C)(Q)$; 
        Z is the relation denoted by Q 

D_DR is a non-saturating function that definite heads can type-shift to. Above, 
we formulate it as a combinatorial rule D_DR. When D functions as D_DR it
introduces the context set variable C. D_DR does not create a referential expression, 
but is simply a modifier of Q, apparently emerging to fix the mismatch since D 
is fed the wrong type of argument. By supplying C, which is an anaphor, D_DR 
triggers the presupposition that the common ground contains a property to be 
picked as the value for C. Application of D_DR, in other words, creates a
presuppositional, anaphoric domain for Q, necessitating a discourse-familiar property 
to be anchored to. This renders the interpretation of the QP akin to a partitive, 
although it is not morphologically a partitive (for more details, see Etxeberria & 
Giannakidou 2009; 2014). 

Syntactically, we assume that D attaches to Q, so the result is a QP with the 
following structure: 


(19) a. [QP o_D + kathe_Q [NP fititis_N]] 
     b. o kathe fititis = [(C) kathe] (student) ‘each student’ 

     b. Greek: o kathe fititis = [(C) kathe] (fititis) 
     c. Basque: ikasle guzti-ak = (ikasle) [guzti (C)] 
     d. $[\![Q]\!] = \lambda P\,\lambda R.\ \forall x.\ P(x) \rightarrow R(x)$ 
     e. $[\![D]\!] = \lambda Z_{et,ett}\,\lambda P_{et}\,\lambda R_{et}.\ Z(P \cap C)(R)$; 
        Z is the relation denoted by Q 
     f. $[\![D(Q)]\!] = \lambda P\,\lambda R.\ \forall x.\ (P(x) \wedge C(x)) \rightarrow R(x)$ 

O kathe ‘each’ and guzti-ak ‘all’ end up being presuppositional Qs since their 
domain will always be anaphoric to C, as a consequence of them being D-restricted.
Crucially, Etxeberria and Giannakidou argue that the composition of each 
(and similar D-universals cross-linguistically) involves a structure parallel to the 
Greek/Basque: [D-every]; only, in contrast to Greek/Basque, with each, D is 
covert. Typologically, D with Qs in Greek, Basque, Hungarian, Bulgarian, and 
St'át'imcets shifts to D_DR, but English the does not, so whether D can function 
as D_DR in a given language is subject to parametrization.² In a language lacking 
a definite article, the shift to D_DR will be done by the closest approximant of
definiteness, e.g. Chinese dou (Cheng 2009), and Korean ku, which is a morphological 
demonstrative (Kang 2015). 

In introducing D_DR, we enrich definiteness to include this possibility of D not 
saturating its argument. NPs preceded by the definite article (definite descrip- 
tions) are referential expressions, which, since the classical treatments of Russell 
(1905), Strawson (1952), and Heim (1982) are known to denote familiar unique en- 
tities. In many accounts, reference and familiarity are considered the core prop- 
erties of a definite description, while uniqueness is a derived one (informational 
uniqueness in Roberts 2003; see also Ward & Birner 1995; Elbourne 2005; Lud- 
low 2007 for counterexamples to uniqueness, and Schwarz 2009 suggesting that 
in German familiarity and uniqueness can be distinguished). In other theories, 
uniqueness is the core, as in the account by Coppock & Beaver (2015) who argue 
that "definiteness is a morphological category which, in English, marks a (weak) 
uniqueness presupposition, while determinacy consists in denoting an individ- 
ual" (Coppock & Beaver 2015: 377). 

Like us, Coppock & Beaver (2015) propose a non-saturating denotation for the, 
with the uniqueness presupposition designated by the $\partial$ operator: 

²But why do we have this contrast in the ability of D to perform D_DR? Could it be a random fact 
about Ds across languages? Could it relate to availability of repair strategies more generally? 
Clearly, whether a D can perform D_DR cannot be due to the morphological status of D since, as 
shown earlier, Greek o and English the are similar, independent heads and monosyllabic. Greek 
o, however, is phonologically weaker than English the, so perhaps phonological weakness is a 
factor. Suffixal Ds like the Basque D are phonologically weaker too, clitic-like Ds. 

(21) Lexical entry: the 
     the $\leadsto \lambda P.\,\lambda x.\,[\partial(|P| \leq 1) \wedge P(x)]$ 

(22) [ $\lambda x\,[\partial(|\mathrm{MOON}| \leq 1) \wedge \mathrm{MOON}(x)]$ 
       [ the: $\lambda P\,\lambda x\,[\partial(|P| \leq 1) \wedge P(x)]$ ] 
       [ moon: $\lambda x.\,\mathrm{MOON}(x)$ ] ] 


The moon denotes the property of being a moon, defined only if there is no 
more than one moon. This analysis, like our D_DR, does not saturate the NP argument,
and referential closure happens on top of that, by a covert type shifter. 
This amounts to saying that D itself is not referential in this basic use. Our D plus 
Q data remain mysterious under this analysis. (Weak definite data, where 
uniqueness appears to be systematically violated, also remain mysterious.) Roberts's 
theory of definiteness, on the other hand, seems to provide a more appropriate 
frame for domain restriction. 
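For comparison, the non-saturating entry in (21) and the composition in (22) can be sketched as follows. The rendering is an assumption of this sketch rather than Coppock & Beaver's own formalization: the $\partial$ presupposition is modelled as a raised exception, predicates are finite sets, and the covert type shifter is given the hypothetical name `iota_shift`.

```python
class PresuppositionFailure(Exception):
    """Raised when a presupposition marked by the partial operator is not met."""

def the(P):
    """λP.λx.[∂(|P| ≤ 1) ∧ P(x)]: maps a predicate to a predicate."""
    def predicate(x):
        if len(P) > 1:
            raise PresuppositionFailure("more than one individual satisfies P")
        return x in P
    return predicate

def iota_shift(predicate, domain):
    """Hypothetical covert type shifter: from a predicate to the unique individual."""
    satisfiers = [x for x in domain if predicate(x)]
    if len(satisfiers) != 1:
        raise PresuppositionFailure("no unique satisfier")
    return satisfiers[0]

# Invented toy model.
DOMAIN = ["luna", "mars"]
MOON = frozenset({"luna"})

the_moon = the(MOON)                  # still a predicate, not yet referential
print(the_moon("luna"), the_moon("mars"))
print(iota_shift(the_moon, DOMAIN))   # referential closure applied on top
```

The sketch makes the division of labour visible: `the` contributes only the uniqueness condition, and reference arises in a separate step.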

Roberts (2003) argues that definites conventionally trigger two presupposi- 
tions: one of weak familiarity, and a second one called INFORMATIONAL UNIQUE- 
NESS. These are the informational counterparts of Russellian existence and 
uniqueness, respectively. 

Roberts (2004) argues that the same presuppositions characterize the meaning 
of pronouns and demonstratives (Roberts 2002). In more recent work (Roberts 
2010) a Gricean view is developed which permits a simplification of her earlier 
theory in that the uniqueness effect observed in certain contexts follows from 
retrievability, with no need to stipulate even informational uniqueness. The
resulting theory stands in contrast to a number of other recent treatments of
definites (Neale 1990, as well as those that treat definites as E-type or D-type implicit 
descriptions, e.g. Heim 1990; Elbourne 2005, inter alia; Coppock & Beaver 2015; see 
also Fara 2001). For the purposes of this paper, it is not necessary to dwell on the 
details of this discussion; we will concentrate on the main theses of Roberts's 
theory that are essential to our analysis of D_DR: 


(23) a. English Definite NPs: definite descriptions, personal pronouns, 
        demonstrative descriptions and pronouns, proper names. 
     b. Semantic Definiteness: A DP is definite if it carries an anaphoric 
        presupposition of weak familiarity. 
     c. Weak familiarity: Weak familiarity requires that the existence of the 
        relevant entity be entailed in the common ground. Existence 
        entailments alone are sufficient to license introduction of a discourse 
        referent into the context. Weak familiarity does not mean previous 
        mention. Previous mention is strong familiarity. 
     d. The antecedent of an anaphoric expression is the discourse referent 
        which satisfies its anaphoric presupposition. 
     e. Anaphora and weak familiarity do not presuppose a linguistic 
        antecedent. 
     f. Pronouns, unlike definite descriptions, carry the additional 
        presupposition that the discourse referent which satisfies their 
        presupposition is maximally salient at that point in the discourse. 
        This explains why uniqueness effects do not arise with pronouns. 


In other words, 


The notion of familiarity involved [in definites] is not that more commonly 
assumed, which I will call strong familiarity, where this usually involves ex- 
plicit previous mention of the entity in question. Rather, I define a new no- 
tion, that of weak familiarity wherein the existence of the entity in question 
need only be entailed by the (local) context of interpretation. [...] Gricean 
principles and the epistemic features of particular types of context are in- 
voked to explain the uniqueness effects observed by Russell and others. 
(Roberts 2003: 288) 


The notions of hearer-old versus discourse-old have also been used (Prince 1981; 
Ward & Birner 1995) to distinguish different "shades" of familiarity. 

The definiteness criterion is thus the anaphoric presupposition of weak familiarity,
and some definites will further need prior mention (strong familiarity). 


Our idea that D in D_DR supplies a context set C renders D_DR a case of property 
anaphora, since C targets a familiar property in the common ground. In D_DR, D 
is a signal that such a property exists in the common ground. This renders the 
D-restricted QP similar to a partitive (every one of the students), since this is the 
typical structure where the NP domain is presupposed. 

We move on now to provide some syntactic arguments for our direct composition
of D with Q. 

2.2 D_DR does not produce a syntactic DP 


The application of D_DR, as we envision it, is a type shifting rule; but we could 
also think of it as a lexical modification of Q. In either case, a type shifting or 
lexical rule would not make us expect that the product will alter the category 
of Q: we have a QP and not a DP. However, one could ask: how do we know 
that Greek o kathe or Basque guzti-ak (and the rest of Basque strong Qs that can 
be modified by D; Etxeberria 2005; 2009) do not create DPs? These are certainly 
attested structures: 


(24) a. Greek 
        [I   [tris  fitites  pu   irthan sto    parti]] itan endelos    methismeni. 
        [the [three students that came   to.the party]] were completely drunk 
        ‘The three students that came to the party were completely drunk.’ 
     b. Basque 
        [[Festara      etorri ziren  hiru  ikasle]  -ak]     erabat     mozkortuta zeuden. 
        [[to.the.party came   AUX.PL three student] -DET.PL] completely drunk      were 
        ‘The three students that came to the party were completely drunk.’ 


These are referential DPs. The output is of type e, and not a GQ, which is the
output of the D_DR structure, as we argued. What are the arguments that our
D_DR structure is not a DP of this kind? Etxeberria & Giannakidou (2014) offer a
number of arguments which we summarize here.³

Apart from the obvious fact that to kathe agori 'each boy' is a quantificational
expression, evidence that D in o-kathe does not create a DP comes from two facts.
First, [o-kathe NP] cannot co-occur with the demonstrative pronoun (aftos ‘this’,
ekinos ‘that’), which in Greek, like in many other languages, must embed DPs
(Stavrou 1983; Stavrou & Horrocks 1989; Alexiadou et al. 2008):⁴


³ Etxeberria (2005; 2009) excludes the hypothesis that Basque Qs that combine with the D are
adjectives. The reader is referred to these works for extensive discussion on this point.

⁴ The Greek test based on the incompatibility of demonstratives with D-restricted o kathe
cannot be used in Basque because the D and the demonstratives appear in the same syntactic
position D (we exemplify in (i) only with the singular).

(25) Greek
     a. aftos *(o) fititis
        this the student
        ‘this student’
     b. ekinos *(o) fititis
        that the student
        ‘that student’

(26) Greek
     a. afti / ekini i tris fitites
        these / those the three students
        ‘these / those three students’
     b. aftos / ekinos o enas fititis
        this / that the one student
        ‘this / that one student’

(27) Greek
     * aftos / * ekinos o kathe fititis
     this / that the every student
     (Lit. ‘This / that each student’)


The demonstratives aftos/ekinos are not D heads in Greek, but phrases in [Spec,
DP] (Stavrou & Horrocks 1989). Since the demonstrative cannot occur with o kathe,
we must conclude that the phrase headed by D-kathe is not a DP.


(i) Basque 
a. ikasle-a 
student-DET.SG 
‘the student’ 


b. ikasle hau/hori/hura 
student DEM.SG.PROXIMAL/MEDIAL/DISTAL 


‘this/that/that student’ 


c. * ikasle-a hau/hori/hura 
student-DET.SG DEM.SG.PROXIMAL/MEDIAL/DISTAL 


(‘this/that/that student’)

The second piece of evidence that o kathe NP does not behave syntactically as a
DP comes from the fact that it cannot reduplicate. Polydefinites, as we mentioned
in §1, are pervasive in Greek (see Alexiadou & Wilder 1998; Campos & Stavrou
2004; Kolliakou 2004; Ioannidou & den Dikken 2006; Lekakou & Szendrői 2007):


(28) Greek
     o kokinos o tixos
     the red.NOM the wall.NOM
     ‘the wall that is red’

Reduplication is not possible with o kathe, but it is with a numeral:


(29) Greek
     a. * o kathe o fititis
        the each the student
        (‘each student’)
     b. o enas o fititis
        the one the student
        ‘the one student’
     c. i tris i fitites
        the three the students
        ‘the three students’


These are, in fact, equivalent semantically to partitives, a point to which we 
return: 


(30) Greek 
a. enas apo tus fitites 
one of the students 
‘one of the students’ 


b. tris apo tous fitites 
three of the students 


‘three of the students’ 


In a language where DPs duplicate easily, the impossibility of reduplication 
with o kathe suggests again that o kathe is not a DP. 

A third argument against the DP analysis comes from Basque, where it is possible
to conjoin two NPs or two APs under the same single D, as shown
in (31) and (32) (in Greek this is not possible, so we cannot apply this test).

(31) Basque: NP conjunction
     [DP [NP Ikasle] eta [NP irakasle] -ak] azterket-a garai-a-n daude.
     [ [ student] and [ teacher] -D.PL.ABS] exam-D.SG period-D.SG-IN AUX.PL
     ‘The students and teachers are in exams period.’

(32) Basque: AdjP conjunction
     Maiak [DP [AdjP zaldi haundi] eta [AdjP elefante txiki] -ak] ikusi ditu.
     Maia.ERG [ [ horse big] and [ elephant small] -DET.PL.ABS] see AUX.PL
     ‘Maia has seen the big horses and small elephants.’


If Basque strong Qs created DPs, we would predict that we should be able to conjoin
two strong Qs under the same D; but this is impossible, as shown by the following
examples:

(33) Basque
     a. * [DP [QP Ikasle gehien] eta [QP irakasle guzti] -ak] goiz iritsi ziren.
        [ [ student most] and [ teacher all] -DET.PL.ABS] early arrive AUX.PL
        Intended: ‘Most of the students and all of the teachers arrived early.’
     b. * [DP [QP Neska bakoitz] eta [QP mutil guzti] -ek] sari bat irabazi zuten.
        [ [ girl each] and [ boy all] -DET.PL.ERG] prize one win AUX.PL
        Intended: ‘Each girl and all of the boys won a prize.’


These sentences show that Basque strong Qs create QPs and not DPs headed 
by D (see Etxeberria 2005; 2009 for extensive discussion; for Greek o-kathe, more 
recent discussions are found in Lazaridou-Chatzigoga 2012, Margariti 2014). 

We thus conclude that D-restricted Qs do not create referential DPs, unlike
the combination of D with a weak numeral. Since D in D_DR is a modifier and a
head, the simplest thing to assume is, as we do, that D adjoins to Q. Recall that,
as we said, we can envision this as a lexical or morphological operation. Another
option would be to move D from a lower position and adjoin it to Q in a structure
like [QP[DP[NP]]]:

(34)     QP
        /  \
       Q    DP
           /  \
          D    NP


In this case, we get again a QP since Q would be in a structurally higher position;
hence both movement of D from a lower to a higher position and our direct
adjunction analysis allow D to function as a Q-modifier. In definite reduplication,
as we shall see, we clearly observe instances of D in lower position. In this analysis,
therefore, the structural parallelism with partitivity is more transparent. Given
that the lower D position is indeed needed for D_DR in Greek, as we will argue next, it
seems reasonable to keep it as an analytical option.

We move on now to the St'át'imcets Salish data which illustrate the other
incarnation of D_DR, applying to a predicate. This is a lower D, and will be the
variant needed for Greek D reduplication, we will argue.


3 D_DR on the NP: Partitive meaning


St'át'imcets Salish does not have a definite article, but possesses a morphologically
deictic D (Matthewson 1998; 2008; see Gillon 2006; 2009 for Squamish,
another Salish language). This D, Etxeberria & Giannakidou (2009; 2014) argue,
functions as the Greek and Basque D in D_DR, but can also function as D_DR when
applied to the NP argument. The result is again the introduction of the anaphoric
variable C, yielding a contextually salient set of individuals characterized by the
[NP ∩ C] property:


(35) D to D_DR type-shifting:
     1. D_DR rule: When D composes with NP under Q, use D_DR.
     2. [[D_DR]] = λP_et λx (P(x) ∩ C(x))

(36) i...a in D_DR:
     [[i...a]] = λP_et λx (P(x) ∩ C(x))

As noted in Giannakidou (2004), D_DR works in this case like Chung & Ladusaw's
(2003) Restrict: it does not saturate the NP argument (i.e. it does not close it
under iota), but only restricts it via C. It works like a modifier, as in D_DR on the
Q:

(37) St'át'imcets Salish
     a. Léxlex [tákem-a i smelhmúlhats-a].
        intelligent [all DET.PL woman.PL-DET]
        ‘All of the women are intelligent.’
     b. * Léxlex [tákem-a smelhmúlhats].
        intelligent [all woman.PL]
        (‘All of the women are intelligent.’)

(38) * every the woman

(39) Greek
     * kathe i gynaika
     every the woman
     (‘every woman’)
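
The difference between the restricting use of D in (35)-(36) and the saturating iota use can be illustrated with a small extensional model. The Python sketch below is an illustration only, not part of the analysis: predicates of type et are modelled as finite sets, and the individuals, the context set C, and the function names (d_dr, iota, all_of) are hypothetical choices made for the example.

```python
from typing import Callable, FrozenSet

Entity = str
Pred = FrozenSet[Entity]  # type et, modelled extensionally as a set of entities


def d_dr(c: Pred) -> Callable[[Pred], Pred]:
    """Non-saturating D (et -> et): intersect the predicate with the context set C."""
    return lambda p: p & c


def iota(p: Pred) -> Entity:
    """Saturating D (et -> e): defined only when P has exactly one member."""
    if len(p) != 1:
        raise ValueError("iota is undefined unless P is a singleton")
    (x,) = p
    return x


def all_of(p: Pred) -> Callable[[Pred], bool]:
    """Universal quantifier over sets: true iff P is a subset of Q."""
    return lambda q: p <= q


# Hypothetical toy model (assumed for illustration).
woman: Pred = frozenset({"w1", "w2", "w3"})
intelligent: Pred = frozenset({"w1", "w2"})
C: Pred = frozenset({"w1", "w2", "m1"})  # contextually salient individuals

restricted = d_dr(C)(woman)              # still a predicate, so Q can apply to it
claim = all_of(restricted)(intelligent)  # quantification over the restricted set
```

Because d_dr returns a set rather than an individual, its output remains a suitable argument for the quantifier, which is the sense in which D_DR restricts without saturating.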


Having D_DR as an NP modifier is consistent with the idea of a lower DP layer,
as we mentioned earlier (see Szabolcsi 1987; 2010, and works cited in Alexiadou
et al. 2008). If St'át'imcets D is D_DR, the Salish structures are not as peculiar
as they initially appear, but illustrate a systematic grammaticalization of domain
restriction via D. However, D on NP is generally not allowed in English, Greek
and Basque:


(40) a. * every the boy 
b. * most the boys 
c. * many the boys 
d. * three the boys 


(41) Greek
     a. * kathe to aghori
        every the boy
        (‘every boy’)
     b. * merika ta aghoria
        several the boys
        (‘several boys’)
     c. * tria ta aghoria
        three the boys
        (‘three boys’)

When D is fed an NP, it functions referentially in European languages; hence
the need for the partitive preposition (Greek apo, Basque ablative -tik, etc.) to
give back the right input (et) for composition with Q, e.g. ikasle-eta-tik asko, lit.:
students-D-of many; ‘many of the students’:


(42) Greek
     a. merika apo ta aghoria
        several of the boys
        ‘several of the boys’
     b. tria apo ta aghoria
        three of the boys
        ‘three of the boys’


As Matthewson notes, the Salish DP structures are semantically equivalent to
partitive PPs. In Greek (and Basque) then, the morphological partitive is the
way to do domain restriction on the NP argument (inside quantifier phrases);
and we correlated this in our earlier work with the observation that St'át'imcets
lacks partitive constructions. In European languages, we argued, the partitive is
the analogue of the St'át'imcets Q with the D_DR-restricted NP. This correlation
between partitivity and D_DR is key, as we show in the next section, to understanding
the nature of multiple definites.

We close this section with a few typological remarks. We have added D_DR
as one of the possible functions of definites. DEFINITENESS thus emerges as a family of
functions of D:


(43) Types for D
     • Saturating:
       - et → e (iota); intensionalized version (generic)
     • Non-saturating:
       - et,ett → et,ett (D_DR on Q)
       - et → et (D_DR on NP or AP)

The main division is between saturating (referential) and non-saturating types.
D_DR belongs to the latter, as shown. WEAK DEFINITES discussed in the literature
are saturated, thus referential, and determinacy, as understood in Coppock &
Beaver (2015), only relates to the b-version of non-saturating D. Our point about
D_DR is that D functions as a generalized modifier, applying not just to nouns but
also to quantifiers and, as we will show with D reduplication, adjectives.

Finally, it is not even necessary in our analysis that D_DR be performed, strictly
speaking, by the definite article. Greek, Basque, Bulgarian and Hungarian are all
languages that have a definite article and employ it for D_DR. Why the definite
article and not a demonstrative? Because the definite article is phonologically
weak (a suffix in Basque and Bulgarian, and monosyllabic in Greek and Hungarian),
whereas the demonstrative is typically a strong head (it is heavier lexically and can
stand alone as a phrase; compare the and this: *read the versus read this). In languages
like St'át'imcets and Korean (Kang 2015) that have a deictic D but no article
distinction, the demonstrative performs D_DR (see more arguments in Etxeberria
& Giannakidou 2014 that St'át'imcets D is deictic). Finally, if a language lacks D
altogether but has some element that encodes familiarity, that
element will function as D_DR. The data reported in Cheng (2009) about Chinese
dou confirm this prediction: dou is not a D, but according to Cheng it functions
as D_DR, while also functioning as the iota operator when used with free choice
items (Giannakidou & Cheng 2006).


4 Definite reduplication as involving D_DR


4.1 Multiple Ds with single reference 


The phenomenon of definite reduplication is pervasive in Greek (Alexiadou & 
Wilder 1998; Campos & Stavrou 2004; Kolliakou 2004; Ioannidou & den Dikken 
2006; Lekakou & Szendroi 2007): 


(44) Greek
     a. to kalo paidi
        the good child
        ‘the good child’
     b. * to paidi kalo
        the child good
        (‘the good child’)
     c. to kalo to paidi
        the good the child
        ‘the good child’
     d. to paidi to kalo
        the child the good
        ‘the good child’
     e. * paidi to kalo
        child the good
        (‘the good child’)


In the simple monadic definite, the adjective must precede the noun; this is the 
canonical structure. In the polydefinite construction, one D appears combined 
with the noun whereas a second D combines with the adjective. The order now 
is free, as we see. The major puzzle posed by these [DP+DP] structures is: why 
have them if they are equivalent to simple definites? We will argue here that they 
are not. 

The polydefinite structures are sometimes thought to express a predication
relation between the two DPs, and the phrase would be translated as something
like ‘the child who/that is good’ (Alexiadou & Wilder 1998; Campos & Stavrou
2004). But it has generally been quite difficult in the literature to disentangle the
pragmatic differences between monadic definites and polydefinites.

The order of the elements inside these polydefinites is quite free, as we saw
and as the following examples further show:


(45) Greek
     a. to palio to spiti to petrino
        the old the house the stone-made
        ‘the old house made of stone’
     b. to palio to petrino to spiti
        the old the stone-made the house
        ‘the old house made of stone’
     c. to spiti to palio to petrino
        the house the old the stone-made
        ‘the old house made of stone’


The definite reduplication phenomenon only happens with D; the indefinite
article results in ungrammaticality:

(46) Greek
     a. * ena kalo ena paidi
        a good a child
        (‘a good child’)
     b. * ena palio ena spiti ena petrino
        a old a house a stone-made
        (‘an old house made of stone’)


The D with the noun seems to form the referential core of the structure, i.e. the
DP that refers to an object. The combinations of D with the additional adjectives
are non-referring, and perform D_DR, we will claim. Crucially, the phenomenon
cannot be reduced to weak definiteness as we know it from the literature.


4.2 Multi-D structures, partitives, and D_DR


Our analysis will be that the secondary, adjectival uses of D are applications of
D_DR on a predicate, with the ensuing partitive interpretation. Kolliakou (2004),
as far as we know, is the first to make a clear connection between definite reduplication
and partitive interpretation:


Though in both to kokino podilato [the red bike] and to kokino to podilato 
[the red the bike] the same property ‘red bike’ is uniquely instantiable [in 
the resource situation], only in the latter case is the index anchored to an 
entity that is a proper subset of a previously introduced set. (Kolliakou 2004: 
308, emphasis ours) 


Kolliakou continues that: 


The polydefinite to kokino to podilato, is, therefore, semantically identical to 
the monadic to kokino podilato, whereas the special pragmatic import of the 
former originates from an additional contextual restriction on the anchoring 
of the index that interacts with the common morphosyntactic and semantic 
basis. (Kolliakou 2004: 265, emphasis ours). 


Our take on this idea is that one D is referential, while the other(s) perform D_DR.
While the D plus NP introduces a referent, the additional D combining with
adjectives performs domain restriction, and the multi-D structure is akin to a
partitive.

To understand that the multi-D structure picks out a proper subset of a set
introduced in discourse, consider a uniqueness context where there is only one
bike and it is red. In this context, reduplication is odd:

(47) Greek
     a. # To kokkino to podhilato mou aresei poli!
        the red the bike me like.3SG much
        ‘I like the red bike a lot!’
     b. To kokkino podhilato mou aresei poli!
        the red bike me like.3SG much
        ‘I like the red bike a lot!’


Consider now maximal contexts where there is no subset: 


(48) Greek (Kolliakou 2004)
     Idame tis dilitiriodis (#tis) kobres.
     saw.1PL the poisonous the cobras
     ‘We saw the poisonous cobras.’

(49) Greek (Campos & Stavrou 2004)
     # Tous epikindinous tous kakopious prepi na tous apofevgeis.
     the dangerous the criminals must SUBJ them avoid
     ‘You must avoid the dangerous criminals.’


The polydefinites are odd because all cobras are poisonous and all criminals 
are dangerous. In both the unique and the maximal context partitive readings are 
impossible, and reduplication is impossible too. 

Campos & Stavrou (2004) suggest that polydefinites only have intersective 
readings, see (50b). Compare them with regular DPs in (50a): 


(50) Greek
     a. Gnorises tin orea tragoudistria?
        met.2SG the beautiful singer
        ‘Did you meet the beautiful singer?’
        ✓ the singer who sings beautifully
        ✓ the singer who is beautiful
     b. Gnorises tin orea tin tragoudistria?
        met.2SG the beautiful the singer
        ‘Did you meet the beautiful singer?’
        * the singer who sings beautifully
        ✓ the singer who is beautiful

This fact can be interpreted as further supporting the partitive interpretation
because the non-intersective reading requires either intensionalization or quantification
over events, in either case going beyond the set of physically beautiful
singers.

Finally, consider that partitives with adjectives in Greek are generally quite
odd. Compare the adjectival partitives with the numeral partitive (which we encountered
before). It is fair to generalize that adjectival partitives are odd in English too:

(51) Greek
     Context: In front of us there are red, blue and yellow bikes.
     a. Dyo / Merika apo ta podhilata einai gallika.
        two / several of the bikes are French
        ‘Two / several of the bikes are French.’
     b. ?? Ta kokkina apo ta podhilata einai gallika.
        the red (ones) of the bikes are French
        ‘The red ones of the bikes are French.’
     c. Ta kokkina ta podhilata einai gallika.
        the red the bikes are French
        ‘The red bikes are French.’


Definite reduplication looks like a strategy in Greek to form a partitive
with an adjective, an option not available with the partitive preposition.
The unacceptability of (51b), which holds in English too, is in fact quite interesting,
indicating that an adjective, unlike a numeral, is not a very good device to establish
the part-of relation. Notice that Greek licenses nominal ellipsis with adjectives
(ta kokkina = ‘the red ones’, see Giannakidou & Merchant 1997; Giannakidou
& Stavrou 1999), and the ones version is still odd in English. Hence, the problem
with potential adjectival partitives seems to be not with ellipsis or its equivalents;
it is rather of a semantic nature. An adjective is not a good device to be used in
the partitive structure because it is not a quantity expression and therefore cannot
designate a proper subset (as required by partitivity). Numerals and quantifiers
are the best devices precisely because they are quantity expressions.

Our proposal is that definite reduplication involves the D_DR function on a
predicate, just like in Salish. And given that with adjectives there is no partitive
alternative, the structural parallel is exactly the same (recall that Salish lacks
partitives). The structure is as follows:

(52) Greek
     a. to kokkino to podhilato
        the red the bike
     b. DP[ι(λx(bike(x) ∩ C(x) ∩ red(x)))]
          D: to
          AP[λx(bike(x) ∩ C(x) ∩ red(x))]
            Adj: kokkino
            DP[λx(bike(x) ∩ C(x))]
              D_DR[λP_et λx(P(x) ∩ C(x))]: to
              NP[λx(bike(x))]: podhilato


As we see, the top D functions referentially, to saturate the predicate, now
domain restricted via D_DR coming from below. Since the order permutes syntactically,
and since intersection is commutative, it doesn't matter which predicate
(the adjective or the noun) undergoes D_DR. In fact, the free permutability of
the structure can be seen as an argument in favour of the modifier analysis. The
top D saturates, while any lower Ds perform D_DR. If we have more than two DP
layers (as in to spiti to palio to petrino (lit. ‘the house the old the stone-made’))
we assume that there will be an identity relation between the Cs contributed by
each application of D_DR. C, finally, as is typically the case, will have to refer to a
non-singleton set, hence the partitivity effect.

The simple monadic definite, on the other hand, lacks C and there is no partitive
effect.


(53) to kokkino podhilato (‘the red bike’) = ι(λx(red(x) ∩ bike(x)))
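
The contrast between (52) and (53) can be made concrete with a small extensional sketch. The Python code below is illustrative only: predicates are modelled as finite sets, and the function partitive_ok is an assumed simplification of the partitivity requirement (the adjective must single out a non-empty proper subset of the C-restricted noun set), not a definition taken from the analysis. All individuals and names are hypothetical.

```python
from typing import FrozenSet

Pred = FrozenSet[str]  # type et, modelled as a set of entities


def iota(p: Pred) -> str:
    """Saturating D: defined only when P has exactly one member."""
    if len(p) != 1:
        raise ValueError("iota is undefined unless P is a singleton")
    (x,) = p
    return x


def monadic(adj: Pred, noun: Pred) -> str:
    """(53): iota over adj ∩ noun, with no context set C."""
    return iota(adj & noun)


def polydefinite(adj: Pred, noun: Pred, c: Pred) -> str:
    """(52): the lower D restricts the noun by C; the top D applies iota."""
    return iota((noun & c) & adj)


def partitive_ok(adj: Pred, noun: Pred, c: Pred) -> bool:
    """Assumed felicity check: adj picks a non-empty proper subset of noun ∩ C."""
    domain = noun & c
    part = domain & adj
    return bool(part) and part < domain


# Context with several bikes, one of them red (hypothetical model).
bikes = frozenset({"b1", "b2", "b3"})
red = frozenset({"b1"})

# Uniqueness context as in (47): a single bike, which is red.
one_bike = frozenset({"b1"})

# Maximal context as in (48): every cobra is poisonous.
cobras = frozenset({"c1", "c2"})
poisonous = frozenset({"c1", "c2"})
```

In the first context polydefinite(red, bikes, bikes) and monadic(red, bikes) pick out the same individual, but only the former comes with a satisfiable partitive requirement; in the uniqueness and maximal contexts partitive_ok fails, mirroring the oddness of (47a), (48) and (49).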


The partitive effect can be reinforced by focus, as discussed further in Kolliakou
(2004), e.g. in contrastive contexts: to kokkino to podhilato, oxi to ble ‘the red bike,
not the blue one’.

What we are suggesting here, namely application of D_DR at the lower level(s),
renders, as we said, the reduplication structure of Greek akin to the Salish DP
strategy. Crucially, as in Salish, the structure of reduplication is not that of a
partitive, i.e. it does not involve a PP. There must be agreement
in case and number, just like with all nominals in Greek (we thank a reviewer for
asking this question).

D_DR has been suggested further for certain D+adjective combinations found
in Slavic (Schürcks et al. 2014, Marušič & Žaucer 2014). In Slavic languages,
so-called long adjectives are usually interpreted as definites, with D -i combining
only with the adjective, not the noun:


(54) Serbian
     a. lep grad
        beautiful town
        ‘a beautiful town’
     b. lep-i grad
        beautiful-DEF town
        ‘the beautiful town’
     c. * lep grad-i
        beautiful town-DEF
        (‘the beautiful town’)


In Slovenian, there are similar phenomena. We will not delve into more detail
here, but simply want to note that the strategy of D_DR on the adjective is possible
in other Balkan Sprachbund languages.


4.3 Comparison with other approaches 


The D_DR analysis we proposed seems to be an adequate and simple enough analysis
of the polydefinite structure. Other alternatives, such as the close
apposition analysis proposed by Lekakou & Szendrői (2007), cannot capture some
of the key properties of the structure:

(55) Greek
     a. o aetos to puli
        the eagle the bird

Reduplication as close apposition:

(56) Greek
     a. to spiti to petrino
        the house the stone
     b. [DefP [Def Ø] [DP1,2 [DP1 [D to] [NP spiti]] [DP2 [D to] [NP [AP petrino] [N Ø]]]]]


For this analysis to work, a number of assumptions must be made. First, we
need to assume definiteness “concord” (à la Zeijlstra 2004); but there is no explanation
why reduplication is optional whereas concord is obligatory. Second, a
concord analysis would render the difference between a monadic definite and
a polydefinite semantically vacuous, missing the partitive and anti-uniqueness
effects observed, as well as the correlation with the impossibility of the partitive
with adjectives that we noted. The concord/apposition account, finally, fails to
unify reduplication with the D on Q.

Our analysis does precisely that. It unifies definite reduplication with the D_DR
strategy on a predicate and says that polydefinites fall under the phenomenon
of domain restriction, which involves a modifier function of D. It turns out, then,
very interestingly, that Greek has both options of D_DR. Two open questions are:
(a) why Basque doesn't exhibit the D-reduplication strategy, and (b) whether our
D_DR analysis can extend to capture D-reduplication in other languages (e.g. in
Swedish, noted earlier). We will leave the latter as a prediction of our theory, to
be tested in future research.


5 Conclusions


As a summary of our discussion, we proposed here a modifier analysis D_DR of D
heads cross-linguistically that includes the following two options:


(57) D to D_DR type-shifting:
     1. D_DR rule: When D composes with Q, use D_DR.
     2. [[D_DR]] = λZ_et,ett λP_et λQ_et Z(P ∩ C)(Q);
        Z is the relation denoted by Q
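
The rule in (57) can be sketched extensionally as well. The following Python code is an illustration under stated assumptions, not the authors' implementation: quantifiers are modelled as curried relations between finite sets, and the toy individuals, the context set C, and the names every and d_dr_on_q are hypothetical.

```python
from typing import Callable, FrozenSet

Pred = FrozenSet[str]                              # type et
Quant = Callable[[Pred], Callable[[Pred], bool]]   # type et,ett


def every(p: Pred) -> Callable[[Pred], bool]:
    """Relation denoted by a universal Q: P is a subset of Q."""
    return lambda q: p <= q


def d_dr_on_q(z: Quant, c: Pred) -> Quant:
    """(57): take the relation Z and return Z with its restrictor narrowed to P ∩ C."""
    return lambda p: lambda q: z(p & c)(q)


# Hypothetical toy model.
student: Pred = frozenset({"s1", "s2", "s3"})
arrived: Pred = frozenset({"s1", "s2"})
C: Pred = frozenset({"s1", "s2"})  # the contextually familiar students

unrestricted = every(student)(arrived)             # quantifies over all students
restricted = d_dr_on_q(every, C)(student)(arrived) # quantifies over student ∩ C
```

The output of d_dr_on_q has the same type as its input quantifier, which reflects the claim that the result is still a QP, and the restrictor student ∩ C is by construction a subset of student.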


The domain restricting function is a non-saturating use of D as a modifier
(D_DR); and if our analysis of Greek definite reduplication is correct, Greek also
has the option of D_DR on the predicate, just like Salish.

Clearly, given the data from Greek, Basque and Salish languages in contrast to 
English, a fair question to ask is what determines, in each language, whether the 
available D will have the option to function as a modifier or not. As we suggested 
already, the difference doesn't follow from the morphological status of D since 
Greek o and English the are both independent heads and monosyllabic. Greek 
o, however, is phonologically weaker than the, therefore phonological weakness 
may be a factor, as we noted earlier. Suffixal Ds are phonologically weaker too
since they are clitic Ds; hence, if phonological weakness is a decisive factor, we
expect to find more D_DR in languages with suffixal Ds.

Finally, our analysis of D reduplication as D_DR strengthens our initial link
between D_DR and partitivity, and suggests that it is actually quite general. By
introducing C, D_DR creates partitivity in all cases, since NP intersected with C
will be a subset of NP. The domain after D_DR is therefore always a subset of
a larger domain. Hence, partitivity is present even in the case of application of
D_DR to Q.


Definiteness across languages


Definiteness has been a central topic in theoretical semantics since its modern founda- 
tion. However, despite its significance, there has been surprisingly scarce research on 
its cross-linguistic expression. With the purpose of contributing to filling this gap, the 
present volume gathers thirteen studies exploiting insights from formal semantics and 
syntax, typological and language specific studies, and, crucially, semantic fieldwork and 
cross-linguistic semantics, in order to address the expression and interpretation of def- 
initeness in a diverse group of languages, most of them understudied. The papers pre- 
sented in this volume aim to establish a dialogue between theory and data in order to 
answer the following questions: What formal strategies do natural languages employ to 
encode definiteness? What are the possible meanings associated to this notion across 
languages? Are there different types of definite reference? Which other functions (be-
sides marking definite reference) are associated with definite descriptions? Each of the
papers contained in this volume addresses at least one of these questions and, in doing
so, aims to enrich our understanding of definiteness.